Abstract

We provide formal definitions and efficient secure techniques for 

• turning noisy information into keys usable for any cryptographic application, and, in particular, 

• reliably and securely authenticating biometric data. 

Our techniques apply not just to biometric information, but to any keying material that, unlike traditional
cryptographic keys, is (1) not reproducible precisely and (2) not distributed uniformly. We propose
two primitives: a fuzzy extractor reliably extracts nearly uniform randomness R from its input; the ex- 
traction is error-tolerant in the sense that R will be the same even if the input changes, as long as it 
remains reasonably close to the original. Thus, R can be used as a key in a cryptographic application. 
A secure sketch produces public information about its input w that does not reveal w, and yet allows 
exact recovery of w given another value that is close to w. Thus, it can be used to reliably reproduce 
error-prone biometric inputs without incurring the security risk inherent in storing them. 

We define the primitives to be both formally secure and versatile, generalizing much prior work. In 
addition, we provide nearly optimal constructions of both primitives for various measures of "closeness" 
of input data, such as Hamming distance, edit distance, and set difference. 
1 Introduction



Cryptography traditionally relies on uniformly distributed and precisely reproducible random strings for its 
secrets. Reality, however, makes it difficult to create, store, and reliably retrieve such strings. Strings that 
are neither uniformly random nor reliably reproducible seem to be more plentiful. For example, a random
person's fingerprint or iris scan is clearly not a uniform random string, nor does it get reproduced precisely 
each time it is measured. Similarly, a long pass-phrase (or answers to 15 questions [FJ01] or a list of favorite
movies [JS06]) is not uniformly random and is difficult to remember for a human user. This work is about
using such nonuniform and unreliable secrets in cryptographic applications. Our approach is rigorous and 
general, and our results have both theoretical and practical value. 

To illustrate the use of random strings on a simple example, let us consider the task of password authen- 
tication. A user Alice has a password w and wants to gain access to her account. A trusted server stores 
some information y = f(w) about the password. When Alice enters w, the server lets Alice in only if
f(w) = y. In this simple application, we assume that it is safe for Alice to enter the password for the
verification. However, the server's long-term storage is not assumed to be secure (e.g., y is stored in a publicly
readable /etc/passwd file in UNIX [MT79]). The goal, then, is to design an efficient f that is hard to
invert (i.e., given y it is hard to find w' such that f(w') = y), so that no one can figure out Alice's password
from y. Recall that such functions f are called one-way functions.
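
The enrollment and verification steps just described can be sketched in a few lines. This is only an illustration under stated assumptions: SHA-256 from Python's standard `hashlib` stands in for the one-way function f, and the names `enroll` and `verify` are hypothetical, not taken from the paper.

```python
import hashlib


def f(w: bytes) -> bytes:
    # Assumption of this sketch: SHA-256 plays the role of the one-way function f.
    return hashlib.sha256(w).digest()


def enroll(w: bytes) -> bytes:
    # The server stores only y = f(w), never the password w itself.
    return f(w)


def verify(y: bytes, attempt: bytes) -> bool:
    # Alice is let in only if f(attempt) equals the stored y exactly.
    return f(attempt) == y


y = enroll(b"correct horse battery staple")
```

Note that `verify` accepts only an exact reproduction of w; a single mistyped character changes f(w) completely.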

Unfortunately, the solution above has several problems when used with passwords w available in real 
life. First, the definition of a one-way function assumes that w is truly uniform and guarantees nothing if 
this is not the case. However, human-generated and biometric passwords are far from uniform, although 
they do have some unpredictability in them. Second, Alice has to reproduce her password exactly each 
time she authenticates herself. This restriction severely limits the kinds of passwords that can be used. 
Indeed, a human can precisely memorize and reliably type in only relatively short passwords, which do not 
provide an adequate level of security. Greater levels of security are achieved by longer human-generated and 
biometric passwords, such as pass-phrases, answers to questionnaires, handwritten signatures, fingerprints, 
retina scans, voice commands, and other values selected by humans or provided by nature, possibly in 
combination (see [Fry00] for a survey). These measurements seem to contain much more entropy than
human-memorizable passwords. However, two biometric readings are rarely identical, even though they are 
likely to be close; similarly, humans are unlikely to precisely remember their answers to multiple questions 
from time to time, though such answers will likely be similar. In other words, the ability to tolerate a 
(limited) number of errors in the password while retaining security is crucial if we are to obtain greater 
security than provided by typical user-chosen short passwords. 

The password authentication described above is just one example of a cryptographic application where 
the issues of nonuniformity and error-tolerance naturally come up. Other examples include any cryptographic
application, such as encryption, signatures, or identification, where the secret key comes in the form
of noisy nonuniform data. 

Our Definitions. As discussed above, an important general problem is to convert noisy nonuniform 
inputs into reliably reproducible, uniformly random strings. To this end, we propose a new primitive, termed 
fuzzy extractor. It extracts a uniformly random string R from its input w in a noise-tolerant way. Noise-tolerance
means that if the input changes to some w' but remains close, the string R can be reproduced
exactly. To assist in reproducing R from w', the fuzzy extractor outputs a nonsecret string P. It is important
to note that R remains uniformly random even given P. (Strictly speaking, R will be $\varepsilon$-close to uniform
rather than uniform; $\varepsilon$ can be made exponentially small, which makes R as good as uniform for the usual
applications.)
Figure 1: (a) secure sketch; (b) fuzzy extractor; (c) a sample application: user who encrypts a sensitive
record using a cryptographically strong, uniform key R extracted from biometric w via a fuzzy extractor; 
both P and the encrypted record need not be kept secret, because no one can decrypt the record without a 
w' that is close. 



Our approach is general: R extracted from w can be used as a key in a cryptographic application but,
unlike traditional keys, need not be stored (because it can be recovered from any w' that is close to w). We
define fuzzy extractors to be information-theoretically secure, thus allowing them to be used in cryptographic
systems without introducing additional assumptions (of course, the cryptographic application itself will
typically have computational, rather than information-theoretic, security).

For a concrete example of how to use fuzzy extractors, in the password authentication case, the server 
can store (P, f(R)). When the user inputs w' close to w, the server reproduces the actual R using P and
checks if f(R) matches what it stores. The presence of P will help the adversary invert f(R) only by the
additive amount of $\varepsilon$, because R is $\varepsilon$-close to uniform even given P.^1 Similarly, R can be used for symmetric
encryption, for generating a public-secret key pair, or for other applications that utilize uniformly random
secrets.^2

As a step in constructing fuzzy extractors, and as an interesting object in its own right, we propose 
another primitive, termed secure sketch. It allows precise reconstruction of a noisy input, as follows: on 
input w, a procedure outputs a sketch s. Then, given s and a value w' close to w, it is possible to recover w. 
The sketch is secure in the sense that it does not reveal much about w: w retains much of its entropy even 
if s is known. Thus, instead of storing w for fear that later readings will be noisy, it is possible to store s 
instead, without compromising the privacy of w. A secure sketch, unlike a fuzzy extractor, allows for the 
precise reproduction of the original input, but does not address nonuniformity. 

^1 To be precise, we should note that because we do not require w, and hence P, to be efficiently samplable, we need f to be a
one-way function even in the presence of samples from w; this is implied by security against circuit families.

^2 Naturally, the security of the resulting system should be properly defined and proven and will depend on the possible adversarial
attacks. In particular, in this work we do not consider active attacks on P or scenarios in which the adversary can force multiple
invocations of the extractor with related w and gets to observe the different P values. See [Boy04, BDK+05, DKRS06] for
follow-up work that considers attacks on the fuzzy extractor itself.
Secure sketches, fuzzy extractors and a sample encryption application are illustrated in Figure 1.
Secure sketches and extractors can be viewed as providing fuzzy key storage: they allow recovery of the
secret key (w or R) from a faulty reading w' of the password w by using some public information (s or P). In
particular, fuzzy extractors can be viewed as error- and nonuniformity-tolerant secret key key-encapsulation
mechanisms [Sho01].

Because different biometric information has different error patterns, we do not assume any particular 
notion of closeness between w' and w. Rather, in defining our primitives, we simply assume that w comes 
from some metric space, and that w' is no more than a certain distance from w in that space. We consider 
particular metrics only when building concrete constructions. 

General Results. Before proceeding to construct our primitives for concrete metrics, we make some 
observations about our definitions. We demonstrate that fuzzy extractors can be built out of secure sketches 
by utilizing strong randomness extractors [NZ96], such as, for example, universal hash functions [CW79,
WC81] (randomness extractors, defined more precisely below, are families of hash functions which "convert" a
high-entropy input into a shorter, uniformly distributed output). We also provide a general technique for
constructing secure sketches from transitive families of isometries, which is instantiated in concrete
constructions later in the paper. Finally, we define a notion of a biometric embedding of one metric space into another
and show that the existence of a fuzzy extractor in the target space, combined with a biometric embedding 
of the source into the target, implies the existence of a fuzzy extractor in the source space. 
These general results help us in building and analyzing our constructions. 
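
The first of these observations, building a fuzzy extractor from a secure sketch plus a strong extractor, has a simple shape that can be sketched in code. The following is a hedged outline only: the function names `generate` and `reproduce`, the bit-tuple representation, and the convention that the caller supplies `sketch`, `recover`, and `extract` as callables are all assumptions made for illustration. The public string is P = (s, x), where s is the sketch and x is the extractor seed.

```python
import secrets
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def generate(w: Bits,
             sketch: Callable[[Bits], Bits],
             extract: Callable[[Bits, Bits], Bits],
             seed_len: int) -> Tuple[Bits, Tuple[Bits, Bits]]:
    # Publish a sketch s of w and a fresh random extractor seed x.
    s = sketch(w)
    x = tuple(secrets.randbelow(2) for _ in range(seed_len))
    # The key R is extracted from w itself; P = (s, x) is the nonsecret helper string.
    return extract(w, x), (s, x)


def reproduce(w_noisy: Bits,
              helper: Tuple[Bits, Bits],
              recover: Callable[[Bits, Bits], Bits],
              extract: Callable[[Bits, Bits], Bits]) -> Bits:
    s, x = helper
    # First recover the exact original w from the close reading, then re-extract R.
    w = recover(w_noisy, s)
    return extract(w, x)
```

Because `reproduce` first recovers w exactly, it applies the extractor to the same input and seed as `generate` and therefore outputs the same R whenever recovery succeeds.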

Our Constructions. We provide constructions of secure sketches and fuzzy extractors in three metrics: 
Hamming distance, set difference, and edit distance. Unless stated otherwise, all the constructions are new. 

Hamming distance (i.e., the number of symbol positions that differ between w and w') is perhaps the 
most natural metric to consider. We observe that the "fuzzy-commitment" construction of Juels and Wattenberg
[JW99] based on error-correcting codes can be viewed as a (nearly optimal) secure sketch. We then
apply our general result to convert it into a nearly optimal fuzzy extractor. While our results on the Ham- 
ming distance essentially use previously known constructions, they serve as an important stepping stone for 
the rest of the work. 
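
To make the code-based idea concrete, here is a minimal sketch of the usual code-offset form of such a construction, instantiated with the simplest possible code. The choice of the binary length-n repetition code (codewords all-zeros and all-ones), the bit-list representation, and the names `sketch` and `recover` are assumptions of this illustration; a practical instantiation would use a code with far more codewords.

```python
import secrets
from typing import List


def sketch(w: List[int]) -> List[int]:
    # Choose a random codeword c of the repetition code (all 0s or all 1s)
    # and publish the offset s = w XOR c.
    bit = secrets.randbelow(2)
    return [wi ^ bit for wi in w]


def recover(w_noisy: List[int], s: List[int]) -> List[int]:
    # w_noisy XOR s equals c XOR e, i.e. the codeword c with the reading errors e added.
    shifted = [a ^ b for a, b in zip(w_noisy, s)]
    # Majority vote decodes the repetition code (fewer than n/2 errors tolerated).
    bit = 1 if 2 * sum(shifted) > len(shifted) else 0
    # Undo the offset: w = s XOR c.
    return [si ^ bit for si in s]
```

With n = 7, any reading that differs from w in at most 3 positions is mapped back to w exactly.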

The set difference metric (i.e., size of the symmetric difference of two input sets w and w') is appropriate 
whenever the noisy input is represented as a subset of features from a universe of possible features.^3 We
demonstrate the existence of optimal (with respect to entropy loss) secure sketches and fuzzy extractors for 
this metric. However, this result is mainly of theoretical interest, because (1) it relies on optimal constant- 
weight codes, which we do not know how to construct, and (2) it produces sketches of length proportional 
to the universe size. We then turn our attention to more efficient constructions for this metric in order to 
handle exponentially large universes. We provide two such constructions. 

First, we observe that the "fuzzy vault" construction of Juels and Sudan [JS06] can be viewed as a secure
sketch in this metric (and then converted to a fuzzy extractor using our general result). We provide a new, 
simpler analysis for this construction, which bounds the entropy lost from w given s. This bound is quite 
high unless one makes the size of the output s very large. We then improve the Juels-Sudan construction to 
reduce the entropy loss and the length of s to near optimal. Our improvement in the running time and in the 
length of s is exponential for large universe sizes. However, this improved Juels-Sudan construction retains 
a drawback of the original: it is able to handle only sets of the same fixed size (in particular, |w'| must equal |w|).

^3 A perhaps unexpected application of the set difference metric was explored in [JS06]: a user would like to encrypt a file (e.g.,
her phone number) using a small subset of values from a large universe (e.g., her favorite movies) in such a way that those and only
those with a similar subset (e.g., similar taste in movies) can decrypt it.
Second, we provide an entirely different construction, called PinSketch, that maintains the exponential
improvements in sketch size and running time and also handles variable set size. To obtain it, we note that 
in the case of a small universe, a set can be simply encoded as its characteristic vector (1 if an element is
in the set, 0 if it is not), and set difference becomes Hamming distance. Even though the length of such a
vector becomes unmanageable as the universe size grows, we demonstrate that this approach can be made 
to work quite efficiently even for exponentially large universes (in particular, because it is not necessary to 
ever actually write down the vector). This involves a result that may be of independent interest: we show 
that BCH codes can be decoded in time polynomial in the weight of the received corrupted word (i.e., in 
sublinear time if the weight is small). 
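
The correspondence between sets and characteristic vectors is easy to check on a toy universe. In the sketch below, the universe {0, ..., 7} and the helper names are illustrative assumptions; the point is only that the symmetric difference of the sets can be computed directly, without ever materializing the vectors.

```python
from typing import List, Set


def characteristic_vector(s: Set[int], universe_size: int) -> List[int]:
    # Entry i is 1 if element i is in the set and 0 if it is not.
    return [1 if i in s else 0 for i in range(universe_size)]


def hamming(u: List[int], v: List[int]) -> int:
    # Number of positions in which the two vectors differ.
    return sum(a != b for a, b in zip(u, v))


w = {1, 4, 6}
w_prime = {1, 5, 6, 7}

via_vectors = hamming(characteristic_vector(w, 8), characteristic_vector(w_prime, 8))
via_sets = len(w ^ w_prime)  # size of the symmetric difference, no vector needed
```

Both computations give the same distance, but only the second stays cheap when the universe is exponentially large.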

Finally, edit distance (i.e., the number of insertions and deletions needed to convert one string into the 
other) comes up, for example, when the password is entered as a string, due to typing errors or mistakes 
made in handwriting recognition. We discuss two approaches for secure sketches and fuzzy extractors for 
this metric. First, we observe that a recent low-distortion embedding of Ostrovsky and Rabani [OR05]
immediately gives a construction for edit distance. The construction performs well when the number of
errors to be corrected is very small (say $n^\alpha$ for $\alpha < 1$) but cannot tolerate a large number of errors. Second,
we give a biometric embedding (which is less demanding than a low-distortion embedding, but suffices for
obtaining fuzzy extractors) from the edit distance metric into the set difference metric. Composing it with a
fuzzy extractor for set difference gives a different construction for edit distance, which does better when t is
large; it can handle as many as $O(n / \log^2 n)$ errors with meaningful entropy loss.

Most of the above constructions are quite practical; some implementations are available [HJR06].

Extending Results for Probabilistic Notions of Correctness. The definitions and construc- 
tions just described use a very strong error model: we require that secure sketches and fuzzy extractors 
accept every secret w' which is sufficiently close to the original secret w, with probability 1. Such a stringent
model is useful, as it makes no assumptions on the stochastic and computational properties of the error
process. However, slightly relaxing the error conditions allows constructions which tolerate a (provably)
much larger number of errors, at the price of restricting the settings in which the constructions can be
applied. In Section 8, we extend the definitions and constructions of earlier sections to several relaxed error
models.

It is well-known that in the standard setting of error-correction for a binary communication channel, 
one can tolerate many more errors when the errors are random and independent than when the errors are 
determined adversarially. In contrast, we present fuzzy extractors that meet Shannon's bounds for correcting 
random errors and, moreover, can correct the same number of errors even when errors are adversarial. In our
setting, therefore, under a proper relaxation of the correctness condition, adversarial errors are no stronger
than random ones. The constructions are quite simple and draw on existing techniques from the coding
literature [BBR88, DGL04, Gur03, Lan04, MPSW05].

Relation to Previous Work. Since our work combines elements of error correction, randomness 
extraction and password authentication, there has been a lot of related work. 

The need to deal with nonuniform and low-entropy passwords has long been realized in the security 
community, and many approaches have been proposed. For example, Kelsey et al. [KSHW97] suggested
using f(w, r) in place of w for the password authentication scenario, where r is a public random "salt,"
to make a brute-force attacker's life harder. While practically useful, this approach does not add any
entropy to the password and does not formally address the needed properties of f. Another approach, more
closely related to ours, is to add biometric features to the password. For example, Ellison et al. [EHMS00]
proposed asking the user a series of n personalized questions and using these answers to encrypt the
"actual" truly random secret R. A similar approach using the user's keyboard dynamics (and, subsequently,
voice [MRLW01a, MRLW01b]) was proposed by Monrose et al. [MRW99]. These approaches require the
design of a secure "fuzzy encryption." The above works proposed heuristic designs (using various forms of
Shamir's secret sharing), but gave no formal analysis. Additionally, error tolerance was addressed only by
brute force search.

A formal approach to error tolerance in biometrics was taken by Juels and Wattenberg [JW99] (for
less formal solutions, see [DFMP99, MRW99, EHMS00]), who provided a simple way to tolerate errors
in uniformly distributed passwords. Frykholm and Juels [FJ01] extended this solution and provided
entropy analysis to which ours is similar. Similar approaches have been explored earlier in seemingly
unrelated literature on cryptographic information reconciliation, often in the context of quantum cryptography
(where Alice and Bob wish to derive a secret key from secrets that have small Hamming distance),
particularly [BBR88, BBCS91]. Our construction for the Hamming distance is essentially the same as a component
of the quantum oblivious transfer protocol of [BBCS91].

Juels and Sudan [JS06] provided the first construction for a metric other than Hamming: they
constructed a "fuzzy vault" scheme for the set difference metric. The main difference is that [JS06] lacks a
cryptographically strong definition of the object constructed. In particular, their construction leaks a
significant amount of information about their analog of R, even though it leaves the adversary with provably "many
valid choices" for R. In retrospect, their informal notion is closely related to our secure sketches. Our
constructions in Section 6 improve exponentially over the construction of [JS06] for storage and computation
costs, in the setting when the set elements come from a large universe.

Linnartz and Tuyls [LT03] defined and constructed a primitive very similar to a fuzzy extractor (that
line of work was continued in [VTDL03]). The definition of [LT03] focuses on the continuous space
and assumes a particular input distribution (typically a known, multivariate Gaussian). Thus, our definition
of a fuzzy extractor can be viewed as a generalization of the notion of a "shielding function" from [LT03].
However, our constructions focus on discrete metric spaces.

Other approaches have also been taken for guaranteeing the privacy of noisy data. Csirmaz and Katona 
[CK03] considered quantization for correcting errors in "physical random functions." (This corresponds
roughly to secure sketches with no public storage.) Barral, Coron and Naccache [BCN04] proposed a
system for offline, private comparison of fingerprints. Although seemingly similar, the problem they study 
is complementary to ours, and the two solutions can be combined to yield systems which enjoy the benefits 
of both. 

Work on privacy amplification, e.g., [BBR88, BBCM95], as well as work on derandomization and
hardness amplification, e.g., [HILL99, NZ96], also addressed the need to extract uniform randomness from a
random variable about which some information has been leaked. A major focus of follow-up research has
been the development of (ordinary, not fuzzy) extractors with short seeds (see [Sha02] for a survey). We
use extractors in this work (though for our purposes, universal hashing is sufficient). Conversely, our work
has been applied recently to privacy amplification: Ding [Din05] used fuzzy extractors for noise tolerance
in Maurer's bounded storage model [Mau93].

Independently of our work, similar techniques appeared in the literature on noncryptographic information
reconciliation [MTZ03, CT04] (where the goal is communication efficiency rather than secrecy). The
relationship between secure sketches and efficient information reconciliation is explored further in Section 9,
which discusses, in particular, how our secure sketches for set differences provide more efficient solutions 
to the set and string reconciliation problems. 

Follow-up Work. Since the original presentation of this paper [DRS04], several follow-up works have
appeared (e.g., [Boy04, BDK+05, DS05, DORS06, Smi07, CL06, LSM06, CFL06]). We refer the reader to
a recent survey about fuzzy extractors [DRS07] for more information.



2 Preliminaries 

Unless explicitly stated otherwise, all logarithms below are base 2. The Hamming weight (or just weight) 
of a string is the number of nonzero characters in it. We use $U_\ell$ to denote the uniform distribution on $\ell$-bit
binary strings. If an algorithm (or a function) f is randomized, we use the semicolon when we wish to make
the randomness explicit: i.e., we denote by f(x; r) the result of computing f on input x with randomness
r. If X is a probability distribution, then f(X) is the distribution induced on the image of f by applying
the (possibly probabilistic) function f. If X is a random variable, we will (slightly) abuse notation and also
denote by X the probability distribution on the range of the variable. 

2.1 Metric Spaces 

A metric space is a set $\mathcal{M}$ with a distance function $\mathrm{dis} : \mathcal{M} \times \mathcal{M} \to \mathbb{R}^+ = [0, \infty)$. For the purposes of
this work, $\mathcal{M}$ will always be a finite set, and the distance function will take on only integer values (with
$\mathrm{dis}(x, y) = 0$ if and only if $x = y$) and will obey symmetry $\mathrm{dis}(x, y) = \mathrm{dis}(y, x)$ and the triangle inequality
$\mathrm{dis}(x, z) \le \mathrm{dis}(x, y) + \mathrm{dis}(y, z)$ (we adopt these requirements for simplicity of exposition, even though the
definitions and most of the results below can be generalized to remove these restrictions).
We will concentrate on the following metrics. 

1. Hamming metric. Here $\mathcal{M} = \mathcal{F}^n$ for some alphabet $\mathcal{F}$, and $\mathrm{dis}(w, w')$ is the number of positions in
which the strings w and w' differ.

2. Set difference metric. Here $\mathcal{M}$ consists of all subsets of a universe $\mathcal{U}$. For two sets w, w', their
symmetric difference is $w \triangle w' \stackrel{\mathrm{def}}{=} \{x \in w \cup w' \mid x \notin w \cap w'\}$. The distance between two sets w, w' is
$|w \triangle w'|$.^4 We will sometimes restrict $\mathcal{M}$ to contain only s-element subsets for some s.

3. Edit metric. Here $\mathcal{M} = \mathcal{F}^*$, and the distance between w and w' is defined to be the smallest
number of character insertions and deletions needed to transform w into w'.^5 (This is different from
the Hamming metric because insertions and deletions shift the characters that are to the right of the
insertion/deletion point.)

As already mentioned, all three metrics seem natural for biometric data. 
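
The three distance functions listed above can be written down directly. The following sketch assumes strings over an arbitrary alphabet and Python sets for the universe; the function names are illustrative. The edit distance counts insertions and deletions only, so it equals |w| + |w'| minus twice the length of a longest common subsequence.

```python
from typing import Set


def hamming_distance(w: str, w2: str) -> int:
    # Defined only for strings of equal length n.
    if len(w) != len(w2):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(a != b for a, b in zip(w, w2))


def set_difference_distance(w: Set, w2: Set) -> int:
    # Size of the symmetric difference (unscaled).
    return len(w ^ w2)


def edit_distance(w: str, w2: str) -> int:
    # Insertions and deletions only: |w| + |w2| - 2 * LCS(w, w2).
    prev = [0] * (len(w2) + 1)
    for a in w:
        cur = [0]
        for j, b in enumerate(w2, start=1):
            cur.append(prev[j - 1] + 1 if a == b else max(prev[j], cur[j - 1]))
        prev = cur
    return len(w) + len(w2) - 2 * prev[-1]
```

For instance, "abcd" and "bcda" differ in every position under the Hamming metric, yet one deletion and one insertion transform the first into the second, which illustrates the shifting effect mentioned in item 3.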

2.2 Codes and Syndromes 

Since we want to achieve error tolerance in various metric spaces, we will use error-correcting codes for
a particular metric. A code C is a subset $\{w_0, \ldots, w_{K-1}\}$ of K elements of $\mathcal{M}$. The map from i to $w_i$,
which we will also sometimes denote by C, is called encoding. The minimum distance of C is the smallest
$d > 0$ such that for all $i \ne j$ we have $\mathrm{dis}(w_i, w_j) \ge d$. In our case of integer metrics, this means that one

^4 In the preliminary version of this work [DRS04], we worked with this metric scaled by $\frac{1}{2}$; that is, the distance was $\frac{1}{2}|w \triangle w'|$.
Not scaling makes more sense, particularly when w and w' are of potentially different sizes since $|w \triangle w'|$ may be odd. It also
agrees with the Hamming distance of characteristic vectors; see Section 6.

^5 Again, in [DRS04], we worked with this metric scaled by $\frac{1}{2}$. Likewise, this makes little sense when strings can be of different
lengths, and we avoid it here.
can detect up to $(d - 1)$ "errors" in an element of $\mathcal{M}$. The error-correcting distance of C is the largest
number $t > 0$ such that for every $w \in \mathcal{M}$ there exists at most one codeword c in the ball of radius t around
w: $\mathrm{dis}(w, c) \le t$ for at most one $c \in C$. This means that one can correct up to t errors in an element w of
$\mathcal{M}$; we will use the term decoding for the map that finds, given w, the $c \in C$ such that $\mathrm{dis}(w, c) \le t$ (note
that for some w, such c may not exist, but if it exists, it will be unique; note also that decoding is not the
inverse of encoding in our terminology). For integer metrics by triangle inequality we are guaranteed that
$t \ge \lfloor (d - 1)/2 \rfloor$. Since error correction will be more important than error detection in our applications, we
denote the corresponding codes as $(\mathcal{M}, K, t)$-codes. For efficiency purposes, we will often want encoding
and decoding to be polynomial-time.
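
A small worked example may help fix these definitions. The four-codeword binary code below is an arbitrary illustration (not taken from the paper), and the brute-force `decode` is exponentially slower than the polynomial-time decoders the text asks for; it is meant only to show what the decoding map returns.

```python
from itertools import combinations
from typing import List, Optional


def dis(u: str, v: str) -> int:
    # Hamming distance between equal-length strings.
    return sum(a != b for a, b in zip(u, v))


def minimum_distance(code: List[str]) -> int:
    # Smallest distance between two distinct codewords.
    return min(dis(u, v) for u, v in combinations(code, 2))


def decode(code: List[str], w: str, t: int) -> Optional[str]:
    # Return the codeword within distance t of w, or None if there is none.
    near = [c for c in code if dis(w, c) <= t]
    return near[0] if near else None


CODE = ["00000", "11100", "00111", "11011"]
d = minimum_distance(CODE)
t = (d - 1) // 2  # guaranteed error-correcting distance for integer metrics
```

Here d = 3 and t = 1: the word "10100" decodes to "11100", while "10001" is at distance at least 2 from every codeword and has no decoding.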

For the Hamming metric over $\mathcal{F}^n$, we will sometimes call $k = \log_{|\mathcal{F}|} K$ the dimension of the code and
denote the code itself as an $[n, k, d = 2t + 1]_{\mathcal{F}}$-code, following the standard notation in the literature. We will
denote by $A_{|\mathcal{F}|}(n, d)$ the maximum K possible in such a code (omitting the subscript when $|\mathcal{F}| = 2$), and
by $A(n, d, s)$ the maximum K for such a code over $\{0, 1\}^n$ with the additional restriction that all codewords
have exactly s ones.

If the code is linear (i.e., $\mathcal{F}$ is a field, $\mathcal{F}^n$ is a vector space over $\mathcal{F}$, and C is a linear subspace), then
one can fix a parity-check matrix H as any matrix whose rows generate the orthogonal space $C^\perp$. Then
for any $v \in \mathcal{F}^n$, the syndrome $\mathrm{syn}(v) \stackrel{\mathrm{def}}{=} Hv$. The syndrome of a vector is its projection onto the subspace
that is orthogonal to the code and can thus be intuitively viewed as the vector modulo the code. Note that
$v \in C \Leftrightarrow \mathrm{syn}(v) = 0$. Note also that H is an $(n - k) \times n$ matrix and that $\mathrm{syn}(v)$ is $n - k$ bits long.

The syndrome captures all the information necessary for decoding. That is, suppose a codeword c is 
sent through a channel and the word $w = c + e$ is received. First, the syndrome of w is the syndrome of e:
$\mathrm{syn}(w) = \mathrm{syn}(c) + \mathrm{syn}(e) = 0 + \mathrm{syn}(e) = \mathrm{syn}(e)$. Moreover, for any value u, there is at most one word e
of weight less than $d/2$ such that $\mathrm{syn}(e) = u$ (because the existence of a pair of distinct words $e_1, e_2$ would
mean that $e_1 - e_2$ is a codeword of weight less than d, but since $0^n$ is also a codeword and the minimum
distance of the code is d, this is impossible). Thus, knowing syndrome $\mathrm{syn}(w)$ is enough to determine the
error pattern e if not too many errors occurred. 
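
The two facts above (the syndrome of the received word equals the syndrome of the error, and a low-weight error is determined by its syndrome) can be checked on a concrete linear code. The sketch below assumes the binary [7, 4, 3] Hamming code, chosen here purely for illustration, whose parity-check matrix has the binary expansion of j as its j-th column.

```python
from typing import List, Tuple

# Parity-check matrix of the binary [7, 4, 3] Hamming code:
# row i holds bit i of the column index j, for j = 1..7.
H = [[(j >> i) & 1 for j in range(1, 8)] for i in range(3)]


def syn(v: List[int]) -> Tuple[int, ...]:
    # syn(v) = H v over GF(2); it has n - k = 3 bits.
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)


def correct_one_error(w: List[int]) -> List[int]:
    # Read the syndrome as an integer: it is the 1-based position of a
    # single error, or 0 if w is already a codeword.
    s = syn(w)
    pos = s[0] + 2 * s[1] + 4 * s[2]
    e = [1 if j == pos else 0 for j in range(1, 8)]
    return [a ^ b for a, b in zip(w, e)]


c = [1, 1, 1, 0, 0, 0, 0]           # a codeword: syn(c) = 0
e = [0, 0, 0, 0, 1, 0, 0]           # one error, weight 1 < d/2
w = [a ^ b for a, b in zip(c, e)]   # received word w = c + e
```

Since d = 3, only errors of weight at most 1 are uniquely determined by the syndrome in this code.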

2.3 Min-Entropy, Statistical Distance, Universal Hashing, and Strong Extractors 

When discussing security, one is often interested in the probability that the adversary predicts a random 
value (e.g., guesses a secret key). The adversary's best strategy, of course, is to guess the most likely value. 
Thus, predictability of a random variable A is $\max_a \Pr[A = a]$, and, correspondingly, min-entropy $\mathbf{H}_\infty(A)$
is $-\log(\max_a \Pr[A = a])$ (min-entropy can thus be viewed as the "worst-case" entropy [CG88]; see also
Section 2.4).

The min-entropy of a distribution tells us how many nearly uniform random bits can be extracted from it. 
The notion of "nearly" is defined as follows. The statistical distance between two probability distributions 
A and B is $\mathbf{SD}(A, B) = \frac{1}{2} \sum_v |\Pr(A = v) - \Pr(B = v)|$.
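
Both quantities are straightforward to compute for small explicit distributions. In the sketch below, distributions are represented as dictionaries from outcomes to probabilities, which is an assumption of this illustration, as are the function names and the example distribution.

```python
import math
from typing import Dict


def min_entropy(dist: Dict[str, float]) -> float:
    # H_inf(A) = -log2 of the probability of the most likely value.
    return -math.log2(max(dist.values()))


def statistical_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    # SD(A, B) = 1/2 * sum over v of |Pr(A = v) - Pr(B = v)|.
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)


biased = {"00": 0.5, "01": 0.25, "10": 0.125, "11": 0.125}
uniform = {v: 0.25 for v in biased}
```

The biased distribution on two-bit strings has only 1 bit of min-entropy (its most likely value has probability 1/2) and is at statistical distance 1/4 from uniform.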

Recall the definition of strong randomness extractors [NZ96].

Definition 1. Let $\mathrm{Ext} : \{0, 1\}^n \to \{0, 1\}^\ell$ be a polynomial time probabilistic function which uses r bits of
randomness. We say that Ext is an efficient $(n, m, \ell, \varepsilon)$-strong extractor if for all min-entropy m distributions
W on $\{0, 1\}^n$, $\mathbf{SD}((\mathrm{Ext}(W; X), X), (U_\ell, X)) \le \varepsilon$, where X is uniform on $\{0, 1\}^r$.

Strong extractors can extract at most $\ell = m - 2\log(1/\varepsilon) + O(1)$ nearly random bits [RTS00]. Many
constructions match this bound (see Shaltiel's survey [Sha02] for references). Extractor constructions are
often complex since they seek to minimize the length of the seed X. For our purposes, the length of X will
be less important, so universal hash functions [CW79, WC81] (defined in the lemma below) will already
give us the optimal $\ell = m - 2\log(1/\varepsilon) + 2$, as given by the leftover hash lemma below (see [HILL99, Lemma
4.8] as well as references therein for earlier versions):

Lemma 2.1 (Universal Hash Functions and the Leftover-Hash / Privacy-Amplification Lemma). Assume a
family of functions $\{H_x : \{0, 1\}^n \to \{0, 1\}^\ell\}_{x \in \mathcal{X}}$ is universal: for all $a \ne b \in \{0, 1\}^n$, $\Pr_{x \in \mathcal{X}}[H_x(a) =
H_x(b)] = 2^{-\ell}$. Then, for any random variable W,^6

$$\mathbf{SD}\left((H_X(W), X), (U_\ell, X)\right) \le \frac{1}{2} \sqrt{2^{-\mathbf{H}_\infty(W)} \, 2^{\ell}}. \qquad (1)$$

In particular, universal hash functions are $(n, m, \ell, \varepsilon)$-strong extractors whenever $\ell \le m - 2\log(1/\varepsilon) + 2$.
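
As an illustration of the lemma's hypothesis and conclusion, the sketch below uses one standard universal family, multiplication by a random $\ell \times n$ binary matrix over GF(2); this particular family, the exhaustive check, and the function names are assumptions chosen for the example. For $a \ne b$ each output bit of $M(a \oplus b)$ is an independent uniform bit, so the collision probability is exactly $2^{-\ell}$.

```python
import itertools
import math
from typing import Tuple

Bits = Tuple[int, ...]


def hash_bits(matrix: Tuple[Bits, ...], a: Bits) -> Bits:
    # H_x(a) = M a over GF(2), where the seed x is the l-by-n binary matrix M.
    return tuple(sum(r * x for r, x in zip(row, a)) % 2 for row in matrix)


def collision_probability(a: Bits, b: Bits, n: int, l: int) -> float:
    # Exhaustively average over all 2^(l*n) seeds (feasible only for tiny n, l).
    seeds = itertools.product(itertools.product((0, 1), repeat=n), repeat=l)
    hits = total = 0
    for m in seeds:
        total += 1
        hits += hash_bits(m, a) == hash_bits(m, b)
    return hits / total


def max_output_length(m: float, eps: float) -> int:
    # Largest integer l with l <= m - 2 log2(1/eps) + 2.
    return math.floor(m - 2 * math.log2(1 / eps) + 2)
```

For example, with m = 128 bits of min-entropy and $\varepsilon = 2^{-20}$, the bound allows extracting up to 90 bits.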

2.4 Average Min-Entropy 

Recall that predictability of a random variable A is $\max_a \Pr[A = a]$, and its min-entropy $\mathbf{H}_\infty(A)$ is
$-\log(\max_a \Pr[A = a])$. Consider now a pair of (possibly correlated) random variables A, B. If the
adversary finds out the value b of B, then predictability of A becomes $\max_a \Pr[A = a \mid B = b]$. On
average, the adversary's chance of success in predicting A is then $\mathbb{E}_{b \leftarrow B}[\max_a \Pr[A = a \mid B = b]]$. Note
that we are taking the average over B (which is not under adversarial control), but the worst case over A
(because prediction of A is adversarial once b is known). Again, it is convenient to talk about security in
log-scale, which is why we define the average min-entropy of A given B as simply the negative logarithm of the
above:



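$$\tilde{\mathbf{H}}_\infty(A \mid B) \stackrel{\mathrm{def}}{=} -\log\left(\mathbb{E}_{b \leftarrow B}\left[\max_a \Pr[A = a \mid B = b]\right]\right) = -\log\left(\mathbb{E}_{b \leftarrow B}\left[2^{-\mathbf{H}_\infty(A \mid B = b)}\right]\right).$$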



Because other notions of entropy have been studied in the cryptographic literature, a few words are in order
to explain why this definition is useful. Note the importance of taking the logarithm after taking the average
(in contrast, for instance, to conditional Shannon entropy). One may think it more natural to define average
min-entropy as $\mathbb{E}_{b \leftarrow B}[H_\infty(A \mid B = b)]$, thus reversing the order of $\log$ and $\mathbb{E}$. However, this notion is
unlikely to be useful in a security application. For a simple example, consider the case when $A$ and $B$ are
1000-bit strings distributed as follows: $B = U_{1000}$, and $A$ is equal to the value $b$ of $B$ if the first bit of $b$ is
0, and to $U_{1000}$ (independent of $B$) otherwise. Then for half of the values of $b$, $H_\infty(A \mid B = b) = 0$, while
for the other half, $H_\infty(A \mid B = b) = 1000$, so $\mathbb{E}_{b \leftarrow B}[H_\infty(A \mid B = b)] = 500$. However, it would be
obviously incorrect to say that $A$ has 500 bits of security. In fact, an adversary who knows the value $b$ of $B$
has a slightly greater than 50% chance of predicting the value of $A$ by outputting $b$. Our definition correctly
captures this 50% chance of prediction, because $\tilde{H}_\infty(A \mid B)$ is slightly less than 1. In fact, our definition of
average min-entropy is simply the negative logarithm of predictability.
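The example above can be checked numerically. The following is a minimal sketch, assuming the same distribution but with a configurable string length $n$ (the function names are illustrative, not from the paper); it contrasts the average min-entropy with the "reversed" notion that averages the per-$b$ min-entropies.

```python
import math

def avg_min_entropy(n):
    """Average min-entropy of A given B in the example, for n-bit strings.

    B is uniform on n bits. If the first bit of b is 0, then A = b
    (predictability 1); otherwise A is uniform and independent of B
    (predictability 2**-n).
    """
    expected_predictability = 0.5 * 1.0 + 0.5 * 2.0 ** -n
    return -math.log2(expected_predictability)

def expected_min_entropy(n):
    """The reversed notion E_b[H_inf(A | B = b)]: half of the b give 0, half give n."""
    return 0.5 * 0 + 0.5 * n

for n in (4, 16, 1000):
    # For n = 1000 the term 2**-1000 is below double precision relative to 0.5,
    # so the printed average min-entropy rounds to 1.0.
    print(n, avg_min_entropy(n), expected_min_entropy(n))
```

For small $n$ the average min-entropy is visibly below 1 bit, while the reversed notion reports $n/2$ bits, which overstates the adversary's difficulty.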

The following useful properties of average min-entropy are proven in Appendix A. We also refer the
reader to Appendix B for a generalization of average min-entropy and a discussion of the relationship
between this notion and other notions of entropy.

Lemma 2.2. Let $A, B, C$ be random variables. Then

(a) For any $\delta > 0$, the conditional entropy $H_\infty(A \mid B = b)$ is at least $\tilde{H}_\infty(A \mid B) - \log(1/\delta)$ with
probability at least $1 - \delta$ over the choice of $b$.

(b) If $B$ has at most $2^\lambda$ possible values, then $\tilde{H}_\infty(A \mid (B, C)) \ge \tilde{H}_\infty((A, B) \mid C) - \lambda \ge \tilde{H}_\infty(A \mid C) - \lambda$.
In particular, $\tilde{H}_\infty(A \mid B) \ge H_\infty((A, B)) - \lambda \ge H_\infty(A) - \lambda$.

*In [HILL99], this inequality is formulated in terms of Rényi entropy of order two of $W$; the change to $H_\infty(W)$ is allowed
because the latter is no greater than the former.

2.5 Average-Case Extractors

Recall from Definition 1 that a strong extractor allows one to extract almost all the min-entropy from some
nonuniform random variable $W$. In many situations, $W$ represents the adversary's uncertainty about some
secret $w$ conditioned on some side information $i$. Since this side information $i$ is often probabilistic, we
shall find the following generalization of a strong extractor useful (see Lemma 4.1).

Definition 2. Let $\mathsf{Ext} : \{0,1\}^n \to \{0,1\}^\ell$ be a polynomial time probabilistic function which uses $r$
bits of randomness. We say that Ext is an efficient average-case $(n, m, \ell, \epsilon)$-strong extractor if for all
pairs of random variables $(W, I)$ such that $W$ is an $n$-bit string satisfying $\tilde{H}_\infty(W \mid I) \ge m$, we have
$\mathbf{SD}((\mathsf{Ext}(W; X), X, I), (U_\ell, X, I)) \le \epsilon$, where $X$ is uniform on $\{0,1\}^r$.

To distinguish the strong extractors of Definition 1 from average-case strong extractors, we will
sometimes call the former worst-case strong extractors. The two notions are closely related, as can be seen from
the following simple application of Lemma 2.2(a).

Lemma 2.3. For any $\delta > 0$, if Ext is a (worst-case) $(n, m - \log(1/\delta), \ell, \epsilon)$-strong extractor, then Ext is also
an average-case $(n, m, \ell, \epsilon + \delta)$-strong extractor.

Proof. Assume $(W, I)$ are such that $\tilde{H}_\infty(W \mid I) \ge m$. Let $W_i = (W \mid I = i)$ and let us call the value $i$
"bad" if $H_\infty(W_i) < m - \log(1/\delta)$. Otherwise, we say that $i$ is "good". By Lemma 2.2(a), $\Pr(i \text{ is bad}) \le \delta$.
Also, for any good $i$, we have that Ext extracts $\ell$ bits that are $\epsilon$-close to uniform from $W_i$. Thus, by
conditioning on the "goodness" of $I$, we get

$$\begin{aligned}
\mathbf{SD}((\mathsf{Ext}(W; X), X, I), (U_\ell, X, I)) &= \sum_i \Pr(i) \cdot \mathbf{SD}((\mathsf{Ext}(W_i; X), X), (U_\ell, X)) \\
&\le \Pr(i \text{ is bad}) \cdot 1 + \sum_{\text{good } i} \Pr(i) \cdot \mathbf{SD}((\mathsf{Ext}(W_i; X), X), (U_\ell, X)) \\
&\le \delta + \epsilon.
\end{aligned}$$

□

However, for many strong extractors we do not have to suffer this additional dependence on $\delta$, because
the strong extractor may be already average-case. In particular, this holds for extractors obtained via
universal hashing.

Lemma 2.4 (Generalized Leftover Hash Lemma). Assume $\{H_x : \{0,1\}^n \to \{0,1\}^\ell\}_{x \in X}$ is a family of
universal hash functions. Then, for any random variables $W$ and $I$,

$$\mathbf{SD}((H_X(W), X, I), (U_\ell, X, I)) \le \frac{1}{2}\sqrt{2^{-\tilde{H}_\infty(W \mid I)} 2^\ell}. \qquad (2)$$

In particular, universal hash functions are average-case $(n, m, \ell, \epsilon)$-strong extractors whenever
$\ell \le m - 2\log(1/\epsilon) + 2$.

Proof. Let $W_i = (W \mid I = i)$. Then

$$\begin{aligned}
\mathbf{SD}((H_X(W), X, I), (U_\ell, X, I)) &= \mathbb{E}_i\left[\mathbf{SD}((H_X(W_i), X), (U_\ell, X))\right] \\
&\le \frac{1}{2}\,\mathbb{E}_i\left[\sqrt{2^{-H_\infty(W_i)} 2^\ell}\right] \\
&\le \frac{1}{2}\sqrt{\mathbb{E}_i\left[2^{-H_\infty(W_i)}\right] 2^\ell} \\
&= \frac{1}{2}\sqrt{2^{-\tilde{H}_\infty(W \mid I)} 2^\ell}.
\end{aligned}$$

In the above derivation, the first inequality follows from the standard Leftover Hash Lemma (Lemma 2.1),
and the second inequality follows from Jensen's inequality (namely, $\mathbb{E}[\sqrt{Z}] \le \sqrt{\mathbb{E}[Z]}$).

□



3 New Definitions

3.1 Secure Sketches

Let $\mathcal{M}$ be a metric space with distance function $\mathsf{dis}$.

Definition 3. An $(\mathcal{M}, m, \tilde{m}, t)$-secure sketch is a pair of randomized procedures, "sketch" (SS) and
"recover" (Rec), with the following properties:

1. The sketching procedure SS on input $w \in \mathcal{M}$ returns a bit string $s \in \{0,1\}^*$.

2. The recovery procedure Rec takes an element $w' \in \mathcal{M}$ and a bit string $s \in \{0,1\}^*$. The
correctness property of secure sketches guarantees that if $\mathsf{dis}(w, w') \le t$, then $\mathsf{Rec}(w', \mathsf{SS}(w)) = w$. If
$\mathsf{dis}(w, w') > t$, then no guarantee is provided about the output of Rec.

3. The security property guarantees that for any distribution $W$ over $\mathcal{M}$ with min-entropy $m$, the value
of $W$ can be recovered by the adversary who observes $s$ with probability no greater than $2^{-\tilde{m}}$. That
is, $\tilde{H}_\infty(W \mid \mathsf{SS}(W)) \ge \tilde{m}$.

A secure sketch is efficient if SS and Rec run in expected polynomial time.

Average-Case Secure Sketches. In many situations, it may well be that the adversary's information $i$
about the password $w$ is probabilistic, so that sometimes $i$ reveals a lot about $w$, but most of the time $w$ stays
hard to predict even given $i$. In this case, the previous definition of secure sketch is hard to apply: it provides
no guarantee if $H_\infty(W \mid i)$ is not fixed to at least $m$ for some bad (but infrequent) values of $i$. A more robust
definition would provide the same guarantee for all pairs of variables $(W, I)$ such that predicting the value
of $W$ given the value of $I$ is hard. We therefore define an average-case secure sketch as follows:

Definition 4. An average-case $(\mathcal{M}, m, \tilde{m}, t)$-secure sketch is a secure sketch (as defined in Definition 3)
whose security property is strengthened as follows: for any random variables $W$ over $\mathcal{M}$ and $I$ over $\{0,1\}^*$
such that $\tilde{H}_\infty(W \mid I) \ge m$, we have $\tilde{H}_\infty(W \mid (\mathsf{SS}(W), I)) \ge \tilde{m}$. Note that an average-case secure sketch
is also a secure sketch (take $I$ to be empty).

This definition has the advantage that it composes naturally, as shown in Lemma 4.7. All of our
constructions will in fact be average-case secure sketches. However, we will often omit the term "average-case"
for simplicity of exposition.

Entropy Loss. The quantity $\tilde{m}$ is called the residual (min-)entropy of the secure sketch, and the quantity
$\lambda = m - \tilde{m}$ is called the entropy loss of a secure sketch. In analyzing the security of our secure sketch
constructions below, we will typically bound the entropy loss regardless of $m$, thus obtaining families of
secure sketches that work for all $m$ (in general, [Rey07] shows that the entropy loss of a secure sketch is
upper bounded by its entropy loss on the uniform distribution of inputs). Specifically, for a given construction
of SS, Rec and a given value $t$, we will get a value $\lambda$ for the entropy loss, such that, for any $m$, (SS, Rec) is
an $(\mathcal{M}, m, m - \lambda, t)$-secure sketch. In fact, the most common way to obtain such secure sketches would be
to bound the entropy loss by the length of the secure sketch $\mathsf{SS}(w)$, as given in the following simple lemma:

Lemma 3.1. Assume some algorithms SS and Rec satisfy the correctness property of a secure sketch for
some value of $t$, and that the output range of SS has size at most $2^\lambda$ (this holds, in particular, if the length
of the sketch is bounded by $\lambda$). Then, for any min-entropy threshold $m$, (SS, Rec) form an average-case
$(\mathcal{M}, m, m - \lambda, t)$-secure sketch for $\mathcal{M}$. In particular, for any $m$, the entropy loss of this construction is at
most $\lambda$.

Proof. The result follows immediately from Lemma 2.2(b), since $\mathsf{SS}(W)$ has at most $2^\lambda$ values: for any
$(W, I)$, $\tilde{H}_\infty(W \mid (\mathsf{SS}(W), I)) \ge \tilde{H}_\infty(W \mid I) - \lambda$. □

The above observation formalizes the intuition that a good secure sketch should be as short as possible. 
In particular, a short secure sketch will likely result in a better entropy loss. More discussion about this 
relation can be found in Section 9.
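The counting bound behind Lemma 3.1 can be checked by brute force on a small distribution. The sketch below is only an illustration under stated assumptions: the "sketch" `toy_sketch(w) = w mod 2^λ` is a hypothetical deterministic function with range size $2^\lambda$ (it has no recovery procedure and is not a secure sketch from the paper), and the distribution over 8-bit values is pseudo-random with a fixed seed. For a deterministic sketch, the average predictability of $W$ given the sketch equals $\sum_s \max_{w : \mathsf{SS}(w) = s} \Pr[W = w]$.

```python
import math
import random
from collections import defaultdict

def min_entropy(dist):
    """H_inf(W) for a distribution given as {value: probability}."""
    return -math.log2(max(dist.values()))

def residual_entropy(dist, sketch):
    """Average min-entropy of W given a deterministic sketch of W."""
    best = defaultdict(float)
    for w, p in dist.items():
        s = sketch(w)
        best[s] = max(best[s], p)          # max over w that map to the sketch value s
    return -math.log2(sum(best.values()))

LAMBDA = 3                                  # sketch range has size 2**LAMBDA

def toy_sketch(w):
    """Hypothetical sketch: the low LAMBDA bits of w (for the counting bound only)."""
    return w % (2 ** LAMBDA)

rng = random.Random(0)                      # fixed seed for reproducibility
weights = [rng.random() for _ in range(256)]
total = sum(weights)
dist = {w: weights[w] / total for w in range(256)}

m = min_entropy(dist)
m_residual = residual_entropy(dist, toy_sketch)
print(f"H_inf(W) = {m:.3f}, residual = {m_residual:.3f}, loss = {m - m_residual:.3f}")
```

Whatever distribution is used, the measured loss should not exceed `LAMBDA` bits, which is the content of the lemma.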



3.2 Fuzzy Extractors 

Definition 5. An $(\mathcal{M}, m, \ell, t, \epsilon)$-fuzzy extractor is a pair of randomized procedures, "generate" (Gen) and
"reproduce" (Rep), with the following properties:

1. The generation procedure Gen on input $w \in \mathcal{M}$ outputs an extracted string $R \in \{0,1\}^\ell$ and a helper
string $P \in \{0,1\}^*$.

2. The reproduction procedure Rep takes an element $w' \in \mathcal{M}$ and a bit string $P \in \{0,1\}^*$ as inputs. The
correctness property of fuzzy extractors guarantees that if $\mathsf{dis}(w, w') \le t$ and $R, P$ were generated by
$(R, P) \leftarrow \mathsf{Gen}(w)$, then $\mathsf{Rep}(w', P) = R$. If $\mathsf{dis}(w, w') > t$, then no guarantee is provided about the
output of Rep.

3. The security property guarantees that for any distribution $W$ on $\mathcal{M}$ of min-entropy $m$, the string $R$ is
nearly uniform even for those who observe $P$: if $(R, P) \leftarrow \mathsf{Gen}(W)$, then $\mathbf{SD}((R, P), (U_\ell, P)) \le \epsilon$.

A fuzzy extractor is efficient if Gen and Rep run in expected polynomial time. 

In other words, fuzzy extractors allow one to extract some randomness R from w and then successfully 
reproduce R from any string w' that is close to w. The reproduction uses the helper string P produced during 
the initial extraction; yet P need not remain secret, because R looks truly random even given P. To justify 
our terminology, notice that strong extractors (as defined in Section 2) can indeed be seen as "nonfuzzy"
analogs of fuzzy extractors, corresponding to $t = 0$, $P = X$, and $\mathcal{M} = \{0,1\}^n$.

We reiterate that the nearly uniform random bits output by a fuzzy extractor can be used in any
cryptographic context that requires uniform random bits (e.g., for secret keys). The slight nonuniformity of the
bits may decrease security, but by no more than their distance $\epsilon$ from uniform. By choosing $\epsilon$ negligibly
small (e.g., $2^{-80}$ should be enough in practice), one can make the decrease in security irrelevant.

Similarly to secure sketches, the quantity $m - \ell$ is called the entropy loss of a fuzzy extractor. Also
similarly, a more robust definition is that of an average-case fuzzy extractor, which requires that if
$\tilde{H}_\infty(W \mid I) \ge m$, then $\mathbf{SD}((R, P, I), (U_\ell, P, I)) \le \epsilon$ for any auxiliary random variable $I$.



4 Metric-Independent Results 

In this section we demonstrate some general results that do not depend on specific metric spaces. They will 
be helpful in obtaining specific results for particular metric spaces below. In addition to the results in this 
section, some generic combinatorial lower bounds on secure sketches and fuzzy extractors are contained 
in Appendix C. We will later use these bounds to show the near-optimality of some of our constructions for
the case of uniform inputs.*



4.1 Construction of Fuzzy Extractors from Secure Sketches 



Not surprisingly, secure sketches are quite useful in constructing fuzzy extractors. Specifically, we construct
fuzzy extractors from secure sketches and strong extractors as follows: apply SS to $w$ to obtain $s$, and a
strong extractor Ext with randomness $x$ to $w$ to obtain $R$. Store $(s, x)$ as the helper string $P$. To reproduce
$R$ from $w'$ and $P = (s, x)$, first use $\mathsf{Rec}(w', s)$ to recover $w$ and then $\mathsf{Ext}(w; x)$ to get $R$.







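[Figure: fuzzy extractor from a secure sketch. Gen applies SS (with randomness r) to w to obtain s and applies Ext (with seed x) to w to obtain R; the helper string is P = (s, x). Rep applies Rec to w' and s to recover w, then applies Ext with seed x to output R.]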



A few details need to be filled in. First, in order to apply Ext to $w$, we will assume that one can represent
elements of $\mathcal{M}$ using $n$ bits. Second, since after leaking the secure sketch value $s$, the password $w$ has
only conditional min-entropy, technically we need to use the average-case strong extractor, as defined in
Definition 2. The formal statement is given below.

Lemma 4.1 (Fuzzy Extractors from Sketches). Assume (SS, Rec) is an $(\mathcal{M}, m, \tilde{m}, t)$-secure sketch, and let
Ext be an average-case $(n, \tilde{m}, \ell, \epsilon)$-strong extractor. Then the following (Gen, Rep) is an
$(\mathcal{M}, m, \ell, t, \epsilon)$-fuzzy extractor:

• $\mathsf{Gen}(w; r, x)$: set $P = (\mathsf{SS}(w; r), x)$, $R = \mathsf{Ext}(w; x)$, and output $(R, P)$.

• $\mathsf{Rep}(w', (s, x))$: recover $w = \mathsf{Rec}(w', s)$ and output $R = \mathsf{Ext}(w; x)$.

Proof. From the definition of secure sketch (Definition 3), we know that $\tilde{H}_\infty(W \mid \mathsf{SS}(W)) \ge \tilde{m}$. And since
Ext is an average-case $(n, \tilde{m}, \ell, \epsilon)$-strong extractor, $\mathbf{SD}((\mathsf{Ext}(W; X), \mathsf{SS}(W), X), (U_\ell, \mathsf{SS}(W), X)) =
\mathbf{SD}((R, P), (U_\ell, P)) \le \epsilon$. □
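The Gen/Rep composition of Lemma 4.1 can be sketched in a few lines. The following is a toy illustration only, under several assumptions that are not from the text: the secure sketch is a binary code-offset sketch built from a 3-fold repetition code on 15-bit inputs (so a single bit flip is always corrected), the extractor is a random $\ell \times n$ binary matrix applied over GF(2), and the output length `ELL = 4` is chosen arbitrarily rather than derived from the lemma's parameters.

```python
import secrets

N, REP = 15, 3        # toy parameters: 15-bit inputs, 3-fold repetition code
ELL = 4               # number of extracted bits (illustrative, not derived from m)

def encode(msg_bits):
    """Repeat every message bit REP times."""
    return [b for b in msg_bits for _ in range(REP)]

def decode(word):
    """Majority-decode each block and return the nearest codeword."""
    msg = [int(sum(word[i:i + REP]) * 2 > REP) for i in range(0, N, REP)]
    return encode(msg)

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

def ss(w):
    """Toy secure sketch: offset of w from a random codeword."""
    c = encode([secrets.randbelow(2) for _ in range(N // REP)])
    return xor(w, c)

def rec(w_prime, s):
    """Recover w from a nearby w_prime and the sketch s."""
    return xor(decode(xor(w_prime, s)), s)

def ext(w, x):
    """Universal hash: x is an ELL-by-N binary matrix; output is x*w over GF(2)."""
    return [sum(r & b for r, b in zip(row, w)) % 2 for row in x]

def gen(w):
    """Gen(w): P = (SS(w), x), R = Ext(w; x)."""
    s = ss(w)
    x = [[secrets.randbelow(2) for _ in range(N)] for _ in range(ELL)]
    return ext(w, x), (s, x)

def rep(w_prime, p):
    """Rep(w', (s, x)): recover w with Rec, then re-apply Ext with the same seed."""
    s, x = p
    return ext(rec(w_prime, s), x)

w = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0]
R, P = gen(w)
w_noisy = list(w)
w_noisy[7] ^= 1       # one bit of noise
print(R, rep(w_noisy, P))
```

The helper string `P` is the only value stored; `R` is recomputed from the noisy reading and never written down.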



On the other hand, if one would like to use a worst-case strong extractor, we can apply Lemma 2.3 to
get

Corollary 4.2. If (SS, Rec) is an $(\mathcal{M}, m, \tilde{m}, t)$-secure sketch and Ext is an $(n, \tilde{m} - \log(1/\delta), \ell, \epsilon)$-strong
extractor, then the above construction (Gen, Rep) is an $(\mathcal{M}, m, \ell, t, \epsilon + \delta)$-fuzzy extractor.

Both Lemma 4.1 and Corollary 4.2 hold (with the same proofs) for building average-case fuzzy
extractors from average-case secure sketches.

*Although we believe our constructions to be near optimal for nonuniform inputs as well, and our combinatorial bounds in
Appendix C are also meaningful for such inputs, at this time we can use these bounds effectively only for uniform inputs.

While the above statements work for general extractors, for our purposes we can simply use
universal hashing, since it is an average-case strong extractor that achieves the optimal [RTS00] entropy loss of
$2\log(1/\epsilon)$. In particular, using Lemma 2.4, we obtain our main corollary:

Lemma 4.3. If (SS, Rec) is an $(\mathcal{M}, m, \tilde{m}, t)$-secure sketch and Ext is an $(n, \tilde{m}, \ell, \epsilon)$-strong extractor given
by universal hashing (in particular, any $\ell \le \tilde{m} - 2\log(1/\epsilon) + 2$ can be achieved), then the above construction
(Gen, Rep) is an $(\mathcal{M}, m, \ell, t, \epsilon)$-fuzzy extractor. In particular, one can extract up to $(\tilde{m} - 2\log(1/\epsilon) + 2)$
nearly uniform bits from a secure sketch with residual min-entropy $\tilde{m}$.

Again, if the above secure sketch is average-case secure, then so is the resulting fuzzy extractor. In
fact, combining the above result with Lemma 3.1, we get the following general construction of average-case
fuzzy extractors:

Lemma 4.4. Assume some algorithms SS and Rec satisfy the correctness property of a secure sketch for
some value of $t$, and that the output range of SS has size at most $2^\lambda$ (this holds, in particular, if the
length of the sketch is bounded by $\lambda$). Then, for any min-entropy threshold $m$, there exists an
average-case $(\mathcal{M}, m, m - \lambda - 2\log(1/\epsilon) + 2, t, \epsilon)$-fuzzy extractor for $\mathcal{M}$. In particular, for any $m$, the entropy loss
of the fuzzy extractor is at most $\lambda + 2\log(1/\epsilon) - 2$.

4.2 Secure Sketches for Transitive Metric Spaces 

We give a general technique for building secure sketches in transitive metric spaces, which we now define. A
permutation $\pi$ on a metric space $\mathcal{M}$ is an isometry if it preserves distances, i.e., $\mathsf{dis}(a, b) = \mathsf{dis}(\pi(a), \pi(b))$.
A family of permutations $\Pi = \{\pi_i\}_{i \in \mathcal{I}}$ acts transitively on $\mathcal{M}$ if for any two elements $a, b \in \mathcal{M}$, there
exists $\pi_i \in \Pi$ such that $\pi_i(a) = b$. Suppose we have a family $\Pi$ of transitive isometries for $\mathcal{M}$ (we will
call such $\mathcal{M}$ transitive). For example, in the Hamming space, the set of all shifts $\pi_x(w) = w \oplus x$ is such a
family (see Section 5 for more details on this example).

Construction 1 (Secure Sketch For Transitive Metric Spaces). Let $C$ be an $(\mathcal{M}, K, t)$-code. Then the
general sketching scheme SS is the following: given an input $w \in \mathcal{M}$, pick uniformly at random a codeword
$b \in C$, pick uniformly at random a permutation $\pi \in \Pi$ such that $\pi(w) = b$, and output $\mathsf{SS}(w) = \pi$ (it is
crucial that each $\pi \in \Pi$ should have a canonical description that is independent of how $\pi$ was chosen and, in
particular, independent of $b$ and $w$; the number of possible outputs of SS should thus be $|\Pi|$). The recovery
procedure Rec to find $w$ given $w'$ and the sketch $\pi$ is as follows: find the closest codeword $b'$ to $\pi(w')$, and
output $\pi^{-1}(b')$.

Let $\Gamma = \min_{w, b} |\{\pi \in \Pi \mid \pi(w) = b\}|$; i.e., for each $w$ and $b$,
there are at least $\Gamma$ choices for $\pi$. Then we obtain the following lemma.

Lemma 4.5. (SS, Rec) is an average-case $(\mathcal{M}, m, m - \log|\Pi| + \log\Gamma + \log K, t)$-secure sketch. It is
efficient if operations on the code, as well as $\pi$ and $\pi^{-1}$, can be implemented efficiently.

Proof. Correctness is clear: when $\mathsf{dis}(w, w') \le t$, then $\mathsf{dis}(b, \pi(w')) \le t$, so decoding $\pi(w')$ will result
in $b' = b$, which in turn means that $\pi^{-1}(b') = w$. The intuitive argument for security is as follows:
we add $\log K + \log\Gamma$ bits of entropy by choosing $b$ and $\pi$, and subtract $\log|\Pi|$ by publishing $\pi$. Since
given $\pi$, $w$ and $b$ determine each other, the total entropy loss is $\log|\Pi| - \log K - \log\Gamma$. More formally,
$\tilde{H}_\infty(W \mid (\mathsf{SS}(W), I)) \ge \tilde{H}_\infty((W, \mathsf{SS}(W)) \mid I) - \log|\Pi|$ by Lemma 2.2(b). Given a particular value of $w$,
there are $K$ equiprobable choices for $b$ and, further, at least $\Gamma$ equiprobable choices for $\pi$ once $b$ is picked,
and hence any given permutation $\pi$ is chosen with probability at most $1/(K\Gamma)$ (because different choices
for $b$ result in different choices for $\pi$). Therefore, for all $i$, $w$, and $\pi$, $\Pr[W = w \wedge \mathsf{SS}(w) = \pi \mid I = i] \le
\Pr[W = w \mid I = i]/(K\Gamma)$; hence $\tilde{H}_\infty((W, \mathsf{SS}(W)) \mid I) \ge \tilde{H}_\infty(W \mid I) + \log K + \log\Gamma$. □

Naturally, security loss will be smaller if the code C is denser. 

We will discuss concrete instantiations of this approach in Section 5 and Section 6.

4.3 Changing Metric Spaces via Biometric Embeddings 

We now introduce a general technique that allows one to build fuzzy extractors and secure sketches in some
metric space $\mathcal{M}_1$ from fuzzy extractors and secure sketches in some other metric space $\mathcal{M}_2$. Below, we let
$\mathsf{dis}(\cdot, \cdot)_i$ denote the distance function in $\mathcal{M}_i$. The technique is to embed $\mathcal{M}_1$ into $\mathcal{M}_2$ so as to "preserve"
relevant parameters for fuzzy extraction.

Definition 6. A function $f : \mathcal{M}_1 \to \mathcal{M}_2$ is called a $(t_1, t_2, m_1, m_2)$-biometric embedding if the following
two conditions hold:

• for any $w_1, w_1' \in \mathcal{M}_1$ such that $\mathsf{dis}(w_1, w_1')_1 \le t_1$, we have $\mathsf{dis}(f(w_1), f(w_1'))_2 \le t_2$.

• for any distribution $W_1$ on $\mathcal{M}_1$ of min-entropy at least $m_1$, $f(W_1)$ has min-entropy at least $m_2$.

The following lemma is immediate (correctness of the resulting fuzzy extractor follows from the first
condition, and security follows from the second):

Lemma 4.6. If $f$ is a $(t_1, t_2, m_1, m_2)$-biometric embedding of $\mathcal{M}_1$ into $\mathcal{M}_2$ and $(\mathsf{Gen}(\cdot), \mathsf{Rep}(\cdot, \cdot))$ is an
$(\mathcal{M}_2, m_2, \ell, t_2, \epsilon)$-fuzzy extractor, then $(\mathsf{Gen}(f(\cdot)), \mathsf{Rep}(f(\cdot), \cdot))$ is an $(\mathcal{M}_1, m_1, \ell, t_1, \epsilon)$-fuzzy extractor.

It is easy to define average-case biometric embeddings (in which $\tilde{H}_\infty(W_1 \mid I) \ge m_1 \Rightarrow \tilde{H}_\infty(f(W_1) \mid I) \ge m_2$),
which would result in an analogous lemma for average-case fuzzy extractors.

For a similar result to hold for secure sketches, we need biometric embeddings with an additional
property.

Definition 7. A function $f : \mathcal{M}_1 \to \mathcal{M}_2$ is called a $(t_1, t_2, \lambda)$-biometric embedding with recovery
information $g$ if:

• for any $w_1, w_1' \in \mathcal{M}_1$ such that $\mathsf{dis}(w_1, w_1')_1 \le t_1$, we have $\mathsf{dis}(f(w_1), f(w_1'))_2 \le t_2$.

• $g : \mathcal{M}_1 \to \{0,1\}^*$ is a function with range size at most $2^\lambda$, and $w_1 \in \mathcal{M}_1$ is uniquely determined by
$(f(w_1), g(w_1))$.

With this definition, we get the following analog of Lemma 4.6.

Lemma 4.7. Let $f$ be a $(t_1, t_2, \lambda)$-biometric embedding with recovery information $g$. Let (SS, Rec) be an
$(\mathcal{M}_2, m_1 - \lambda, \tilde{m}_2, t_2)$ average-case secure sketch. Let $\mathsf{SS}'(w) = (\mathsf{SS}(f(w)), g(w))$. Let $\mathsf{Rec}'(w', (s, r))$
be the function obtained by computing $\mathsf{Rec}(f(w'), s)$ to get $f(w)$ and then inverting $(f(w), r)$ to get $w$. Then
(SS', Rec') is an $(\mathcal{M}_1, m_1, \tilde{m}_2, t_1)$ average-case secure sketch.

Proof. The correctness of this construction follows immediately from the two properties given in
Definition 7. As for security, using Lemma 2.2(b) and the fact that the range of $g$ has size at most $2^\lambda$, we
get that $\tilde{H}_\infty(W \mid g(W)) \ge m_1 - \lambda$ whenever $H_\infty(W) \ge m_1$. Moreover, since $W$ is uniquely
recoverable from $f(W)$ and $g(W)$, it follows that $\tilde{H}_\infty(f(W) \mid g(W)) \ge m_1 - \lambda$ as well, whenever
$H_\infty(W) \ge m_1$. Using the fact that (SS, Rec) is an average-case $(\mathcal{M}_2, m_1 - \lambda, \tilde{m}_2, t_2)$ secure sketch,
we get that $\tilde{H}_\infty(f(W) \mid (\mathsf{SS}(f(W)), g(W))) = \tilde{H}_\infty(f(W) \mid \mathsf{SS}'(W)) \ge \tilde{m}_2$. Finally, since the application
of $f$ can only reduce min-entropy, $\tilde{H}_\infty(W \mid \mathsf{SS}'(W)) \ge \tilde{m}_2$ whenever $H_\infty(W) \ge m_1$. □

As we saw, the proof above critically used the notion of average-case secure sketches. Luckily, all our
constructions (for example, those obtained via Lemma 3.1) are average-case, so this subtlety will not matter
too much.

We will see the utility of this novel type of embedding in Section 7.



5 Constructions for Hamming Distance 

In this section we consider constructions for the space $\mathcal{M} = \mathcal{F}^n$ under the Hamming distance metric. Let
$F = |\mathcal{F}|$ and $f = \log_2 F$.

Secure Sketches: The Code-Offset Construction. For the case of $\mathcal{F} = \{0, 1\}$, Juels and
Wattenberg [JW99] considered a notion of "fuzzy commitment." Given an $[n, k, 2t + 1]_2$ error-correcting
code $C$ (not necessarily linear), they fuzzy-commit to $x$ by publishing $w \oplus C(x)$.* Their construction can be
rephrased in our language to give a very simple construction of secure sketches for general $\mathcal{F}$.

We start with an $[n, k, 2t + 1]_{\mathcal{F}}$ error-correcting code $C$ (not necessarily linear). The idea is to use $C$
to correct errors in $w$ even though $w$ may not be in $C$. This is accomplished by shifting the code so that a
codeword matches up with $w$, and storing the shift as the sketch. To do so, we need to view $\mathcal{F}$ as an additive
cyclic group of order $F$ (in the case of most common error-correcting codes, $\mathcal{F}$ will anyway be a field).

Construction 2 (Code-Offset Construction). On input $w$, select a random codeword $c$ (this is equivalent to
choosing a random $x \in \mathcal{F}^k$ and computing $C(x)$), and set $\mathsf{SS}(w)$ to be the shift needed to get from $c$ to
$w$: $\mathsf{SS}(w) = w - c$. Then $\mathsf{Rec}(w', s)$ is computed by subtracting the shift $s$ from $w'$ to get $c' = w' - s$;
decoding $c'$ to get $c$ (note that because $\mathsf{dis}(w', w) \le t$, so is $\mathsf{dis}(c', c)$); and computing $w$ by shifting back to
get $w = c + s$.

In the case of $\mathcal{F} = \{0, 1\}$, addition and subtraction are the same, and we get that computation of the
sketch is the same as the Juels-Wattenberg commitment: $\mathsf{SS}(w) = w \oplus C(x)$. In this case, to recover $w$
given $w'$ and $s = \mathsf{SS}(w)$, compute $c' = w' \oplus s$, decode $c'$ to get $c$, and compute $w = c \oplus s$.
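To illustrate the code-offset construction over a nonbinary alphabet, where subtraction and addition differ, here is a minimal sketch. It assumes the alphabet $\mathbb{Z}_5$ viewed as an additive cyclic group and the length-5 repetition code, a $[5, 1, 5]$ code that corrects $t = 2$ errors; both choices, and all function names, are illustrative assumptions rather than codes suggested by the paper.

```python
import secrets
from collections import Counter

F, N, T = 5, 5, 2     # alphabet Z_5, [5, 1, 5] repetition code, corrects T = 2 errors

def decode(word):
    """Nearest codeword of the repetition code: repeat the most frequent symbol."""
    symbol, _ = Counter(word).most_common(1)[0]
    return [symbol] * N

def ss(w):
    """SS(w) = w - c for a uniformly random codeword c."""
    c = [secrets.randbelow(F)] * N
    return [(wi - ci) % F for wi, ci in zip(w, c)]

def rec(w_prime, s):
    """Rec(w', s): c' = w' - s, decode c' to c, output w = c + s."""
    c_prime = [(wi - si) % F for wi, si in zip(w_prime, s)]
    c = decode(c_prime)
    return [(ci + si) % F for ci, si in zip(c, s)]

w = [3, 1, 4, 1, 0]
s = ss(w)
w_noisy = [3, 2, 4, 1, 4]          # differs from w in two positions
print(rec(w_noisy, s))
```

With at most two corrupted positions, at least three of the five symbols of $c'$ still equal the codeword symbol, so the majority decoder returns $c$ and the shift back yields $w$.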

When the code $C$ is linear, this scheme can be simplified as follows.

Construction 3 (Syndrome Construction). Set $\mathsf{SS}(w) = \mathsf{syn}(w)$. To compute $\mathsf{Rec}(w', s)$, find the unique
vector $e \in \mathcal{F}^n$ of Hamming weight $\le t$ such that $\mathsf{syn}(e) = \mathsf{syn}(w') - s$, and output $w = w' - e$.

As explained in Section 2, finding the short error-vector $e$ from its syndrome is the same as decoding
the code. It is easy to see that the two constructions above are equivalent: given $\mathsf{syn}(w)$ one can sample from
$w - c$ by choosing a random string $v$ with $\mathsf{syn}(v) = \mathsf{syn}(w)$; conversely, $\mathsf{syn}(w - c) = \mathsf{syn}(w)$. To show
that Rec finds the correct $w$, observe that $\mathsf{dis}(w' - e, w') \le t$ by the constraint on the weight of $e$, and
$\mathsf{syn}(w' - e) = \mathsf{syn}(w') - \mathsf{syn}(e) = \mathsf{syn}(w') - (\mathsf{syn}(w') - s) = s$. There can be only one value within
distance $t$ of $w'$ whose syndrome is $s$ (else by subtracting two such values we get a nonzero vector that is closer
than $2t + 1$ to 0, but is also a codeword, contradicting the minimum distance), so $w' - e$ must be equal to $w$.

*In their interpretation, one commits to $x$ by picking a random $w$ and publishing $\mathsf{SS}(w; x)$.
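A small instance of the syndrome construction can be written with the binary Hamming $[7, 4, 3]$ code, which corrects $t = 1$ error and has a 3-bit syndrome. This is a minimal sketch under that assumption; the choice of code and the representation of the syndrome as an integer in $0..7$ are illustrative. The parity-check matrix used has as its $i$-th column (1-based) the binary expansion of $i$, so the syndrome of a weight-one error directly names the flipped position.

```python
def syn(w):
    """Syndrome of a 7-bit word under the Hamming [7,4,3] parity-check matrix
    whose i-th column (1-based) is the binary expansion of i."""
    s = 0
    for i, bit in enumerate(w, start=1):
        if bit:
            s ^= i
    return s

def ss(w):
    """SS(w) = syn(w): a deterministic sketch of n - k = 3 bits."""
    return syn(w)

def rec(w_prime, s):
    """Find the error e of weight <= 1 with syn(e) = syn(w') - s, output w' - e."""
    d = syn(w_prime) ^ s          # over GF(2), subtraction is XOR
    w = list(w_prime)
    if d:
        w[d - 1] ^= 1             # a nonzero syndrome d means position d was flipped
    return w

w = [1, 0, 1, 1, 0, 0, 1]
s = ss(w)
w_noisy = list(w)
w_noisy[4] ^= 1
print(s, rec(w_noisy, s))
```

Note that the sketch is deterministic and only $n - k = 3$ bits long, in line with the entropy-loss discussion for linear codes.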

As mentioned in the introduction, the syndrome construction has appeared before as a component of
some cryptographic protocols over quantum and other noisy channels [BBCS91, Cre97], though it has not
been analyzed the same way.

Both schemes are $(\mathcal{F}^n, m, m - (n - k)f, t)$ secure sketches. For the randomized scheme, the intuition
for understanding the entropy loss is as follows: we add $k$ random elements of $\mathcal{F}$ and publish $n$ elements of
$\mathcal{F}$. The formal proof is simply Lemma 4.5, because addition in $\mathcal{F}^n$ is a family of transitive isometries. For
the syndrome scheme, this follows from Lemma 3.1, because the syndrome is $(n - k)$ elements of $\mathcal{F}$.

We thus obtain the following theorem.

Theorem 5.1. Given an $[n, k, 2t + 1]_{\mathcal{F}}$ error-correcting code, one can construct an average-case
$(\mathcal{F}^n, m, m - (n - k)f, t)$ secure sketch, which is efficient if encoding and decoding are efficient. Furthermore, if the
code is linear, then the sketch is deterministic and its output is $(n - k)$ symbols long.

In Appendix C we present some generic lower bounds on secure sketches and fuzzy extractors. Recall
that $A_F(n, d)$ denotes the maximum number $K$ of codewords possible in a code of distance $d$ over
$n$-character words from an alphabet of size $F$. Then by Lemma C.1, we obtain that the entropy loss of a secure
sketch for the Hamming metric is at least $nf - \log_2 A_F(n, 2t + 1)$ when the input is uniform (that is, when
$m = nf$), because $K(\mathcal{M}, t)$ from Lemma C.1 is in this case equal to $A_F(n, 2t + 1)$ (since a code that
corrects $t$ Hamming errors must have minimum distance at least $2t + 1$). This means that if the underlying
code is optimal (i.e., $K = A_F(n, 2t + 1)$), then the code-offset construction above is optimal for the case of
uniform inputs, because its entropy loss is $nf - \log_F K \cdot \log_2 F = nf - \log_2 K$. Of course, we do not know
the exact value of $A_F(n, d)$, let alone efficiently decodable codes which meet the bound, for many settings
of $F$, $n$ and $d$. Nonetheless, the code-offset scheme gets as close to optimality as is possible from coding
constraints. If better efficient codes are invented, then better (i.e., lower loss or higher error-tolerance) secure
sketches will result.

Fuzzy Extractors. As a warm-up, consider the case when $W$ is uniform ($m = nf$) and look at the
code-offset sketch construction: $v = w - C(x)$. For $\mathsf{Gen}(w)$, output $R = x$, $P = v$. For $\mathsf{Rep}(w', P)$, decode
$w' - P$ to obtain $C(x)$ and apply $C^{-1}$ to obtain $x$. The result, quite clearly, is an $(\mathcal{F}^n, nf, kf, t, 0)$-fuzzy
extractor, since $v$ is truly random and independent of $x$ when $w$ is random. In fact, this is exactly the usage
proposed by Juels and Wattenberg [JW99], except they viewed the above fuzzy extractor as a way to use $w$
to "fuzzy commit" to $x$, without revealing information about $x$.

Unfortunately, the above construction setting $R = x$ works only for uniform $W$, since otherwise $v$
would leak information about $x$.

In general, we use the construction in Lemma 4.3 combined with Theorem 5.1 to obtain the following
theorem.

Theorem 5.2. Given any $[n, k, 2t + 1]_{\mathcal{F}}$ code $C$ and any $m$, $\epsilon$, there exists an average-case $(\mathcal{M}, m, \ell, t, \epsilon)$-
fuzzy extractor, where $\ell = m + kf - nf - 2\log(1/\epsilon) + 2$. The generation Gen and recovery Rep are efficient
if $C$ has efficient encoding and decoding.



6 Constructions for Set Difference



We now turn to inputs that are subsets of a universe $\mathcal{U}$; let $n = |\mathcal{U}|$. This corresponds to representing an
object by a list of its features. Examples include "minutiae" (ridge meetings and endings) in a fingerprint,
short strings which occur in a long document, or lists of favorite movies.

Recall that the distance between two sets $w, w'$ is the size of their symmetric difference: $\mathsf{dis}(w, w') =
|w \triangle w'|$. We will denote this metric space by $\mathsf{SDif}(\mathcal{U})$. A set $w$ can be viewed as its characteristic vector in
$\{0, 1\}^n$, with 1 at position $x \in \mathcal{U}$ if $x \in w$, and 0 otherwise. Such representation of sets makes set difference
the same as the Hamming metric. However, we will mostly focus on settings where $n$ is much larger than
the size of $w$, so that representing a set $w$ by $n$ bits is much less efficient than, say, writing down a list of
elements in $w$, which requires only $|w| \log n$ bits.
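The correspondence between set difference and Hamming distance, and the storage gap between the two representations, can be seen in a short sketch. It assumes the universe is $\{0, \ldots, n-1\}$ and uses illustrative function names and example sets.

```python
import math

def characteristic_vector(w, n):
    """n-bit vector with a 1 at position x exactly when x is in w."""
    return [1 if x in w else 0 for x in range(n)]

def set_difference_distance(w, w_prime):
    """dis(w, w') = size of the symmetric difference."""
    return len(w ^ w_prime)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

n = 32
w = {1, 5, 9}
w_prime = {1, 5, 12, 20}

d_set = set_difference_distance(w, w_prime)
d_ham = hamming(characteristic_vector(w, n), characteristic_vector(w_prime, n))
list_bits = len(w) * math.ceil(math.log2(n))     # |w| log n bits for a list of elements
print(d_set, d_ham, f"vector: {n} bits, list: {list_bits} bits")
```

Already for $n = 32$ the list representation is shorter than the characteristic vector; the gap grows quickly as $n$ increases with the set size held fixed.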

Large Versus Small Universes. More specifically, we will distinguish two broad categories of
settings. Let $s$ denote the size of the sets that are given as inputs to the secure sketch (or fuzzy extractor)
algorithms. Most of this section studies situations where the universe size $n$ is superpolynomial in the set
size $s$. We call this the "large universe" setting. In contrast, the "small universe" setting refers to situations
in which $n = \mathrm{poly}(s)$. We want our various constructions to run in polynomial time and use polynomial
storage space. In the large universe setting, the $n$-bit string representation of a set becomes too large to be
usable; we will strive for solutions that are polynomial in $s$ and $\log n$.

In fact, in many applications (for example, when the input is a list of book titles) it is possible that the
actual universe is not only large, but also difficult to enumerate, making it difficult to even find the position
in the characteristic vector corresponding to $x \in w$. In that case, it is natural to enlarge the universe to a
well-understood class, for example, to include all possible strings of a certain length, whether or not they
are actual book titles. This has the advantage that the position of $x$ in the characteristic vector is simply $x$
itself; however, because the universe is now even larger, the dependence of running time on $n$ becomes even
more important.

Fixed versus Flexible Set Size. In some situations, all objects are represented by feature sets of
exactly the same size $s$, while in others the sets may be of arbitrary size. In particular, the original set $w$
and the corrupted set $w'$ from which we would like to recover the original need not be of the same size. We
refer to these two settings as fixed and flexible set size, respectively. When the set size is fixed, the distance
$\mathsf{dis}(w, w')$ is always even: $\mathsf{dis}(w, w') = t$ if and only if $w$ and $w'$ agree on exactly $s - \frac{t}{2}$ points. We will
denote the restriction of $\mathsf{SDif}(\mathcal{U})$ to $s$-element subsets by $\mathsf{SDif}_s(\mathcal{U})$.

Summary. As a point of reference, we will see below that $\log\binom{n}{s} - \log A(n, 2t + 1, s)$ is a lower bound
on the entropy loss of any secure sketch for set difference (whether or not the set size is fixed). Recall that
$A(n, 2t + 1, s)$ represents the size of the largest code for Hamming space with minimum distance $2t + 1$,
in which every word has weight exactly $s$. In the large universe setting, where $t \ll n$, the lower bound is
approximately $t \log n$. The relevant lower bounds are discussed at the end of Sections 6.1 and 6.2.

In the following sections we will present several schemes which meet this lower bound. The setting of
small universes is discussed in Section 6.1. We discuss the code-offset construction (from Section 5), as
well as a permutation-based scheme which is tailored to fixed set size. The latter scheme is optimal for this
metric, but impractical.

In the remainder of the section, we discuss schemes for the large universe setting. In Section 6.2 we
give an improved version of the scheme of Juels and Sudan [JS06]. Our version achieves optimal entropy
loss and storage $t \log n$ for fixed set size (notice the entropy loss doesn't depend on the set size $s$, although
the running time does). The new scheme provides an exponential improvement over the original parameters
(which are analyzed in Appendix D). Finally, in Section 6.3 we describe how to adapt syndrome decoding
algorithms for BCH codes to our application. The resulting scheme, called PinSketch, has optimal storage
and entropy loss $t \log(n + 1)$, handles flexible set sizes, and is probably the most practical of the schemes
presented here. Another scheme achieving similar parameters (but less efficiently) can be adapted from
information reconciliation literature [MTZ03]: see Section 9 for more details.

| | Entropy Loss | Storage | Time | Set Size | Notes |
|---|---|---|---|---|---|
| Juels-Sudan [JS06] | $t \log n + \log\left(\binom{n}{r} / \binom{n-s}{r-s}\right) + 2$ | $r \log n$ | $\mathrm{poly}(r \log(n))$ | Fixed | $r$ is a parameter, $s \le r \le n$ |
| Generic syndrome | $n - \log A(n, 2t + 1)$ | $n - \log A(n, 2t + 1)$ (for linear codes) | $\mathrm{poly}(n)$ | Flexible | ent. loss $\approx t \log(n)$ when $t \ll n$ |
| Permutation-based | $\log\binom{n}{s} - \log A(n, 2t + 1, s)$ | $O(n \log n)$ | $\mathrm{poly}(n)$ | Fixed | ent. loss $\approx t \log n$ when $t \ll n$ |
| Improved JS | $t \log n$ | $t \log n$ | $\mathrm{poly}(s \log n)$ | Fixed | |
| PinSketch | $t \log(n + 1)$ | $t \log(n + 1)$ | $\mathrm{poly}(s \log n)$ | Flexible | See Section 6.3 for running time |

Table 1: Summary of Secure Sketches for Set Difference.

We do not discuss fuzzy extractors beyond mentioning here that each secure sketch presented in this
section can be converted to a fuzzy extractor using Lemma 4.3. We have already seen an example of such
conversion in Section 5.

Table 1 summarizes the constructions discussed in this section.

6.1 Small Universes 

When the universe size is polynomial in $s$, there are a number of natural constructions. The most direct one,
given previous work, is the construction of Juels and Sudan [JS06]. Unfortunately, that scheme requires a
fixed set size and achieves relatively poor parameters (see Appendix D).

We suggest two possible constructions. The first involves representing sets as $n$-bit strings and using the
constructions of Section 5. The second construction, presented below, requires a fixed set size but achieves
slightly improved parameters by going through "constant-weight" codes.

Permutation-based Sketch. Recall the general construction of Section 4.2 for transitive metric spaces.
Let $\Pi$ be the set of all permutations on $\mathcal{U}$. Given $\pi \in \Pi$, make it a permutation on $\mathsf{SDif}_s(\mathcal{U})$ naturally:
$\pi(w) = \{\pi(x) \mid x \in w\}$. This makes $\Pi$ into a family of transitive isometries on $\mathsf{SDif}_s(\mathcal{U})$, and thus the
results of Section 4.2 apply.

Let $C \subseteq \{0, 1\}^n$ be any $[n, k, 2t + 1]$ binary code in which all words have weight exactly $s$. Such
codes have been studied extensively (see, e.g., [AVZ00, BSSS90] for a summary of known upper and lower
bounds). View elements of the code as sets of size $s$. We obtain the following scheme, which produces a
sketch of length $O(n \log n)$.

Construction 4 (Permutation-Based Sketch). On input $w \subseteq \mathcal{U}$ of size $s$, choose $b \subseteq \mathcal{U}$ at random from
the code $C$, and choose a random permutation $\pi : \mathcal{U} \to \mathcal{U}$ such that $\pi(w) = b$ (that is, choose a random
matching between $w$ and $b$ and a random matching between $\mathcal{U} - w$ and $\mathcal{U} - b$). Output $\mathsf{SS}(w) = \pi$ (say,
by listing $\pi(1), \ldots, \pi(n)$). To recover $w$ from $\pi$ and $w'$ such that $\mathsf{dis}(w, w') \le t$, compute $b' = \pi(w')$,
decode the characteristic vector of $b'$ to obtain $b$, and output $w = \pi^{-1}(b)$.

This construction is efficient as long as decoding is efficient (everything else takes time $O(n \log n)$).
By the results of Section 4.2, its entropy loss is $\log \binom{n}{s} - k$: here $|\Pi| = n!$ and $\Gamma = s!(n - s)!$, so
$\log |\Pi| - \log \Gamma = \log \left( n! / (s!(n - s)!) \right)$.

Comparing the Hamming Scheme with the Permutation Scheme. The code-offset construction
was shown to have entropy loss $n - \log A(n, 2t + 1)$ if an optimal code is used; the random permutation
scheme has entropy loss $\log \binom{n}{s} - \log A(n, 2t + 1, s)$ for an optimal code. The Bassalygo-Elias inequality
(see [vL92]) shows that the bound on the random permutation scheme is always at least as good as the
bound on the code-offset scheme: $A(n, d) \cdot 2^{-n} \le A(n, d, s) \cdot \binom{n}{s}^{-1}$. This implies that
$n - \log A(n, d) \ge \log \binom{n}{s} - \log A(n, d, s)$. Moreover, standard packing arguments give better constructions of
constant-weight codes than they do of ordinary codes.^9 In fact, the random permutation scheme is optimal for this metric,
just as the code-offset scheme is optimal for the Hamming metric.

We show this as follows. Restrict $t$ to be even, because $\mathsf{dis}(w, w')$ is always even if $|w| = |w'|$. Then
the minimum distance of a code over $\mathsf{SDif}_s(\mathcal{U})$ that corrects up to $t$ errors must be at least $2t + 1$. Indeed,
suppose not. Then take two codewords $c_1$ and $c_2$ such that $\mathsf{dis}(c_1, c_2) \le 2t$. There are $k$ elements in $c_1$ that
are not in $c_2$ (call their set $c_1 - c_2$) and $k$ elements in $c_2$ that are not in $c_1$ (call their set $c_2 - c_1$), with $k \le t$.
Starting with $c_1$, remove $t/2$ elements of $c_1 - c_2$ and add $t/2$ elements of $c_2 - c_1$ to obtain a set $w$ (note that
here we are using that $t$ is even; if $k < t/2$, then use $k$ elements). Then $\mathsf{dis}(c_1, w) \le t$ and $\mathsf{dis}(c_2, w) \le t$,
and so if the received word is $w$, the receiver cannot be certain whether the sent word was $c_1$ or $c_2$ and hence
cannot correct $t$ errors.

Therefore by Lemma C.1 we get that the entropy loss of a secure sketch must be at least
$\log \binom{n}{s} - \log A(n, 2t + 1, s)$ in the case of a uniform input $w$. Thus in principle, it is better to use the random
permutation scheme. Nonetheless, there are caveats. First, we do not know of explicitly constructed constant-weight
codes that beat the Elias-Bassalygo inequality and would thus lead to better entropy loss for the random
permutation scheme than for the Hamming scheme (see [BSSS90] for more on constructions of constant-weight
codes and [AVZ00] for upper bounds). Second, much more is known about efficient implementation
of decoding for ordinary codes than for constant-weight codes; for example, one can find off-the-shelf
hardware and software for decoding many binary codes. In practice, the Hamming-based scheme is likely to be
more useful.

6.2 Improving the Construction of Juels and Sudan 

We now turn to the large universe setting, where $n$ is superpolynomial in the set size $s$, and we would like
operations to be polynomial in $s$ and $\log n$.

Juels and Sudan [JS06] proposed a secure sketch for the set difference metric with fixed set size (called
a "fuzzy vault" in that paper). We present their original scheme here with an analysis of the entropy loss in
Appendix D. In particular, our analysis shows that the original scheme has good entropy loss only when the
storage space is very large.

We suggest an improved version of the Juels-Sudan scheme which is simpler and achieves much better
parameters. The entropy loss and storage space of the new scheme are both $t \log n$, which is optimal. (The
same parameters are also achieved by the BCH-based construction PinSketch in Section 6.3.) Our scheme
has the advantage of being even simpler to analyze, and the computations are simpler. As with the original
Juels-Sudan scheme, we assume $n = |\mathcal{U}|$ is a prime power and work over $\mathcal{F} = GF(n)$.

^9 This comes from the fact that the intersection of a ball of radius $d$ with the set of all words of weight $s$ is much smaller than
the ball of radius $d$ itself.

An intuition for the scheme is that the numbers $y_{s+1}, \ldots, y_r$ from the JS scheme need not be chosen at
random. One can instead evaluate them as $y_i = p'(x_i)$ for some polynomial $p'$. One can then represent the
entire list of pairs $(x_i, y_i)$ implicitly, using only a few of the coefficients of $p'$. The new sketch is deterministic
(this was not the case for our preliminary version in [DRS04]). Its implementation is available [HJR06].

Construction 5 (Improved JS Secure Sketch for Sets of Size $s$).
To compute $\mathsf{SS}(w)$:

1. Let $p'()$ be the unique monic polynomial of degree exactly $s$ such that $p'(x) = 0$ for all $x \in w$.
(That is, let $p'(z) = \prod_{x \in w} (z - x)$.)

2. Output the coefficients of $p'()$ of degree $s - 1$ down to $s - t$.

This is equivalent to computing and outputting the first $t$ symmetric polynomials of the values in $w$;
i.e., if $w = \{x_1, \ldots, x_s\}$, then output

$$\sum_i x_i, \quad \sum_{i \ne j} x_i x_j, \quad \ldots, \quad \sum_{S \subseteq [s], |S| = t} \left( \prod_{i \in S} x_i \right).$$

To compute $\mathsf{Rec}(w', p')$, where $w' = \{a_1, a_2, \ldots, a_s\}$:

1. Create a new polynomial $p_{high}$ of degree $s$ which shares the top $t + 1$ coefficients of $p'$; that is, let
$p_{high}(z) = z^s + \sum_{i=s-t}^{s-1} a_i z^i$, where $a_{s-t}, \ldots, a_{s-1}$ here denote the coefficients stored in the sketch.

2. Evaluate $p_{high}$ on all points in $w'$ to obtain $s$ pairs $(a_i, b_i)$.

3. Use $[s, s - t, t + 1]_{\mathcal{U}}$ Reed-Solomon decoding (see, e.g., [Bla83, vL92]) to search for a polynomial
$p_{low}$ of degree $s - t - 1$ such that $p_{low}(a_i) = b_i$ for at least $s - t/2$ of the $a_i$ values. If no such
polynomial exists, then stop and output "fail."

4. Output the list of zeroes (roots) of the polynomial $p_{high} - p_{low}$ (see, e.g., [Sho05] for root-finding
algorithms; they can be sped up by first factoring out the known roots, namely $(z - a_i)$ for the $s - t/2$
values of $a_i$ that were not deemed erroneous in the previous step).
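The two procedures of Construction 5 can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the authors' implementation [HJR06]: the field is the prime field GF(101) (the constant `P` is our choice), Step 3 is done by brute-force interpolation over subsets instead of a real Reed-Solomon decoder, and Step 4 finds roots by trying every field element. All function names are hypothetical.

```python
from itertools import combinations

P = 101  # assumed toy universe size n = |U|; a prime, so GF(n) is integers mod P


def mul_linear(coeffs, root):
    """Multiply a polynomial (coefficients low-to-high) by (z - root) mod P."""
    out = [0] * (len(coeffs) + 1)
    for i, c in enumerate(coeffs):
        out[i + 1] = (out[i + 1] + c) % P
        out[i] = (out[i] - root * c) % P
    return out


def evaluate(coeffs, x):
    """Horner evaluation mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc


def sketch(w, t):
    """SS(w): coefficients of p'(z) = prod (z - x) of degree s-t .. s-1 (ascending)."""
    p = [1]
    for x in sorted(w):
        p = mul_linear(p, x)
    s = len(w)
    return p[s - t:s]


def interpolate(points):
    """Lagrange interpolation: the polynomial of degree < len(points) through points."""
    result = [0] * len(points)
    for i, (xi, yi) in enumerate(points):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                basis = mul_linear(basis, xj)
                denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P  # modular inverse (Python 3.8+)
        for d, c in enumerate(basis):
            result[d] = (result[d] + scale * c) % P
    return result


def recover(w_prime, sk, t):
    """Rec(w', sketch): brute-force stand-in for Reed-Solomon decoding."""
    s = len(w_prime)
    p_high = [0] * (s - t) + list(sk) + [1]            # Step 1
    pairs = [(a, evaluate(p_high, a)) for a in sorted(w_prime)]  # Step 2
    for subset in combinations(pairs, s - t):           # Step 3 (exhaustive)
        p_low = interpolate(list(subset))
        agree = sum(1 for a, b in pairs if evaluate(p_low, a) == b)
        if agree >= s - t // 2:
            diff = [(h - l) % P for h, l in zip(p_high, p_low + [0] * (t + 1))]
            roots = {x for x in range(P) if evaluate(diff, x) == 0}  # Step 4
            return roots if len(roots) == s else None
    return None  # "fail"
```

For instance, with `w = {3, 17, 25, 40, 58, 77}`, `t = 2`, and `w_prime` equal to `w` with 77 replaced by 90 (set difference 2), `recover(w_prime, sketch(w, 2), 2)` returns `w`. The exhaustive search is exponential in general; a real implementation would use an efficient Reed-Solomon decoder and a proper root-finding algorithm.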



To see that this secure sketch can tolerate $t$ set difference errors, suppose $\mathsf{dis}(w, w') \le t$. Let $p'$ be as in
the sketch algorithm; that is, $p'(z) = \prod_{x \in w} (z - x)$. The polynomial $p'$ is monic; that is, its leading term is
$z^s$. We can divide the remaining coefficients into two groups: the high coefficients, denoted $a_{s-t}, \ldots, a_{s-1}$,
and the low coefficients, denoted $b_0, \ldots, b_{s-t-1}$:

$$p'(z) = \underbrace{z^s + \sum_{i=s-t}^{s-1} a_i z^i}_{p_{high}(z)} + \sum_{i=0}^{s-t-1} b_i z^i .$$

We can write $p'$ as $p_{high} + q$, where $q$ has degree $s - t - 1$. The recovery algorithm gets the coefficients of
$p_{high}$ as input. For any point $x$ in $w$, we have $0 = p'(x) = p_{high}(x) + q(x)$. Thus, $p_{high}$ and $-q$ agree at all
points in $w$. Since the set $w$ intersects $w'$ in at least $s - t/2$ points, the polynomial $-q$ satisfies the conditions
of Step 3 in $\mathsf{Rec}$. That polynomial is unique, since no two distinct polynomials of degree $s - t - 1$ can get the
correct $b_i$ on more than $s - t/2$ $a_i$'s (else, they agree on at least $s - t$ points, which is impossible). Therefore,
the recovered polynomial $p_{low}$ must be $-q$; hence $p_{high}(x) - p_{low}(x) = p'(x)$. Thus, $\mathsf{Rec}$ computes the
correct $p'$ and therefore correctly finds the set $w$, which consists of the roots of $p'$.

Since the output of $\mathsf{SS}$ is $t$ field elements, the entropy loss of the scheme is at most $t \log n$ by Lemma 3.1.
(We will see below that this bound is tight, since any sketch must lose at least $t \log n$ in some situations.)
We have proved:

Theorem 6.1 (Analysis of Improved JS). Construction 5 is an average-case $(\mathsf{SDif}_s(\mathcal{U}), m, m - t \log n, t)$
secure sketch. The entropy loss and storage of the scheme are at most $t \log n$, and both the sketch generation
$\mathsf{SS}()$ and the recovery procedure $\mathsf{Rec}()$ run in time polynomial in $s$, $t$ and $\log n$.



Lower Bounds for Fixed Set Size in a Large Universe. The short length of the sketch makes this
scheme feasible for essentially any ratio of set size to universe size (we only need $\log n$ to be polynomial in
$s$). Moreover, for large universes the entropy loss $t \log n$ is essentially optimal for uniform inputs (i.e., when
$m = \log \binom{n}{s}$). We show this as follows. As already mentioned in Section 6.1, Lemma C.1 shows that
for a uniformly distributed input, the best possible entropy loss is $m - m' \ge \log \binom{n}{s} - \log A(n, 2t + 1, s)$.

By Theorem 12 of Agrell et al. [AVZ00], $A(n, 2t + 2, s) \le \binom{n}{s-t} / \binom{s}{s-t}$. Noting that
$A(n, 2t + 1, s) = A(n, 2t + 2, s)$ because distances in $\mathsf{SDif}_s(\mathcal{U})$ are even, the entropy loss is at least

$$m - m' \ge \log \binom{n}{s} - \log A(n, 2t + 1, s) \ge \log \binom{n}{s} - \log \left( \binom{n}{s-t} \Big/ \binom{s}{s-t} \right) = \log \binom{n - s + t}{t} .$$

When $n \gg s$, this last quantity is roughly $t \log n$, as desired.
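The last equality in the chain above is a purely combinatorial identity, and it can be checked numerically. The following Python sketch (function names and the sample parameters are our own, chosen only for illustration) verifies the identity with exact integer arithmetic and compares the resulting lower bound with $t \log n$.

```python
from math import comb, log2


def lower_bound_bits(n, s, t):
    """log C(n,s) - log( C(n,s-t) / C(s,s-t) ), the entropy-loss lower bound in bits."""
    return log2(comb(n, s)) + log2(comb(s, s - t)) - log2(comb(n, s - t))


def closed_form_bits(n, s, t):
    """log C(n-s+t, t), the simplified form of the same bound."""
    return log2(comb(n - s + t, t))


# Illustrative parameters (assumed, not from the paper): large universe, small set.
n, s, t = 2**20, 50, 4

# Exact check of the identity C(n,s) * C(s,s-t) = C(n,s-t) * C(n-s+t,t).
identity_holds = comb(n, s) * comb(s, s - t) == comb(n, s - t) * comb(n - s + t, t)

bound = closed_form_bits(n, s, t)
upper = t * log2(n)  # entropy loss of the improved JS scheme
```

With these parameters the lower bound falls short of $t \log n$ by only a few bits (about $\log t!$), which is the sense in which the improved scheme is "essentially optimal" when $n \gg s$.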



6.3 Large Universes via the Hamming Metric: Sublinear-Time Decoding 

In this section, we show that the syndrome construction of Section 5 can in fact be adapted for small sets in
a large universe, using specific properties of algebraic codes. We will show that BCH codes, which contain
Hamming and Reed-Solomon codes as special cases, have these properties. As opposed to the constructions
of the previous section, the construction of this section is flexible and can accept input sets of any size.

Thus we obtain a sketch for sets of flexible size, with entropy loss and storage $t \log(n + 1)$. We will
assume that $n$ is one less than a power of 2: $n = 2^m - 1$ for some integer $m$, and will identify $\mathcal{U}$ with the
nonzero elements of the binary finite field of degree $m$: $\mathcal{U} = GF(2^m)^*$.

Syndrome Manipulation for Small-Weight Words. Suppose now that we have a small set
$w \subseteq \mathcal{U}$ of size $s$, where $n \gg s$. Let $x_w$ denote the characteristic vector of $w$ (see the beginning of
Section 6). Then the syndrome construction says that $\mathsf{SS}(w) = \mathsf{syn}(x_w)$. This is an $(n - k)$-bit quantity.
Note that the syndrome construction gives us no special advantage over the code-offset construction when
the universe is small: storing the $n$-bit $x_w + C(r)$ for a random $k$-bit $r$ is not a problem. However, it's a
substantial improvement when $n \gg n - k$.

If we want to use $\mathsf{syn}(x_w)$ as the sketch of $w$, then we must choose a code with $n - k$ very small. In
particular, the entropy of $w$ is at most $\log \binom{n}{s} \approx s \log n$, and so the entropy loss $n - k$ had better be at most
$s \log n$. Binary BCH codes are suitable for our purposes: they are a family of $[n, k, \delta]_2$ linear codes with
$\delta = 2t + 1$ and $k = n - tm$ (assuming $n = 2^m - 1$) (see, e.g., [vL92]). These codes are optimal for $t \ll n$
by the Hamming bound, which implies that $k \le n - \log \binom{n}{t}$ [vL92].^10 Using the syndrome sketch with a
BCH code $C$, we get entropy loss $n - k = t \log(n + 1)$, essentially the same as the $t \log n$ of the improved
Juels-Sudan scheme (recall that $\delta \ge 2t + 1$ allows us to correct $t$ set difference errors).

The only problem is that the scheme appears to require computation time $\Omega(n)$, since we must compute
$\mathsf{syn}(x_w) = H x_w$ and, later, run a decoding algorithm to recover $x_w$. For BCH codes, this difficulty can be
overcome. A word of small weight $x$ can be described by listing the positions on which it is nonzero. We
call this description the support of $x$ and write $\mathsf{supp}(x)$ (note that $\mathsf{supp}(x_w) = w$; see the discussion of
enlarging the universe appropriately at the beginning of Section 6).

^10 The Hamming bound is based on the observation that for any code of distance $\delta$, the balls of radius $\lfloor (\delta - 1)/2 \rfloor$ centered at
various codewords must be disjoint. Each such ball contains at least $\binom{n}{\lfloor (\delta-1)/2 \rfloor}$ points, and so
$2^k \binom{n}{\lfloor (\delta-1)/2 \rfloor} \le 2^n$. In our case $\delta = 2t + 1$, and so the bound yields $k \le n - \log \binom{n}{t}$.
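To make the optimality claim concrete, the short Python sketch below compares the BCH entropy loss $n - k = tm$ with the Hamming-bound floor $\log \binom{n}{t}$ for one parameter choice. The function names and the values of `m` and `t` are assumptions chosen for illustration only.

```python
from math import comb, factorial, log2


def bch_loss_bits(m, t):
    """Entropy loss n - k = t*m = t*log2(n + 1) of the binary BCH code with n = 2^m - 1."""
    return t * m


def hamming_floor_bits(m, t):
    """Hamming bound: any binary code of length n and distance 2t+1 has n - k >= log2 C(n, t)."""
    n = 2**m - 1
    return log2(comb(n, t))


m, t = 20, 8                       # assumed sample parameters, with t much smaller than n
loss = bch_loss_bits(m, t)         # 160 bits
floor = hamming_floor_bits(m, t)   # somewhat below 160 bits
gap = loss - floor                 # close to log2(t!) when t << n
```

The gap is about $\log_2 t!$ bits, which is small compared with $t \log(n+1)$ when $t \ll n$; this is the sense in which the BCH parameters are near the Hamming bound.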

The following lemma holds for general BCH codes (which include binary BCH codes and Reed-Solomon
codes as special cases). We state it for binary codes since that is most relevant to the application:

Lemma 6.2. For an $[n, k, \delta]$ binary BCH code $C$ one can compute:

• $\mathsf{syn}(x)$, given $\mathsf{supp}(x)$, in time polynomial in $\delta$, $\log n$, and $|\mathsf{supp}(x)|$;

• $\mathsf{supp}(x)$, given $\mathsf{syn}(x)$ (when $x$ has weight at most $\lfloor (\delta - 1)/2 \rfloor$), in time polynomial in $\delta$ and $\log n$.

The proof of Lemma 6.2 requires a careful reworking of the standard BCH decoding algorithm. The
details are presented in the appendix. For now, we present the resulting secure sketch for set difference.

Construction 6 (PinSketch).

To compute $\mathsf{SS}(w) = \mathsf{syn}(x_w)$:

1. Let $s_i = \sum_{x \in w} x^i$ (computations in $GF(2^m)$).

2. Output $\mathsf{SS}(w) = (s_1, s_3, s_5, \ldots, s_{2t-1})$.

To recover $\mathsf{Rec}(w', (s_1, s_3, \ldots, s_{2t-1}))$:

1. Compute $(s'_1, s'_3, \ldots, s'_{2t-1}) = \mathsf{SS}(w') = \mathsf{syn}(x_{w'})$.

2. Let $\sigma_i = s'_i - s_i$ (in $GF(2^m)$, so "$-$" is the same as "$+$").

3. Compute $\mathsf{supp}(v)$ such that $\mathsf{syn}(v) = (\sigma_1, \sigma_3, \ldots, \sigma_{2t-1})$ and $|\mathsf{supp}(v)| \le t$ by Lemma 6.2.

4. If $\mathsf{dis}(w, w') \le t$, then $\mathsf{supp}(v) = w \triangle w'$. Thus, output $w = w' \triangle \mathsf{supp}(v)$.

An implementation of this construction, including the reworked BCH decoding algorithm, is available [HJR06].
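The structure of PinSketch can be seen in a toy Python sketch over $GF(2^4)$, so that $n = 15$. This is not the implementation of [HJR06]: Step 3 of recovery is replaced by an exhaustive search over all supports of size at most $t$ (exponential in $t$), whereas Lemma 6.2 uses a sublinear-time BCH decoder. The field polynomial $x^4 + x + 1$ and all function names are our own choices for illustration.

```python
from itertools import combinations

M = 4            # assumed toy field degree: universe is GF(2^4)*, so n = 15
MOD = 0b10011    # x^4 + x + 1, an irreducible polynomial over GF(2)


def gf_mul(a, b):
    """Multiply two elements of GF(2^M), represented as integers below 2^M."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:       # reduce as soon as the degree reaches M
            a ^= MOD
    return r


def gf_pow(a, e):
    """Compute a^e in GF(2^M) by repeated multiplication."""
    r = 1
    for _ in range(e):
        r = gf_mul(r, a)
    return r


def pin_sketch(w, t):
    """SS(w) = (s_1, s_3, ..., s_{2t-1}) with s_i = sum over x in w of x^i."""
    out = []
    for i in range(1, 2 * t, 2):
        s_i = 0
        for x in w:
            s_i ^= gf_pow(x, i)   # addition in GF(2^M) is XOR
        out.append(s_i)
    return tuple(out)


def pin_recover(w_prime, sk, t):
    """Rec(w', sketch), with brute-force search standing in for BCH decoding."""
    sigma = tuple(a ^ b for a, b in zip(pin_sketch(w_prime, t), sk))
    for size in range(t + 1):
        for v in combinations(range(1, 2**M), size):
            if pin_sketch(v, t) == sigma:
                return set(w_prime) ^ set(v)   # w = w' symmetric-difference supp(v)
    return None
```

For example, with `w = {1, 2, 5, 9, 12}` and `t = 2`, both `w_prime = {1, 2, 5, 7, 12}` (same size) and `w_prime = {1, 2, 5, 9, 12, 14}` (one extra element) are mapped back to `w`, illustrating that the set size is flexible. The sketch consists of $t$ field elements, that is, $t \log(n+1) = 8$ bits here.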

The bound on entropy loss is easy to see: the output is $t \log(n + 1)$ bits long, and hence the entropy loss
is at most $t \log(n + 1)$ by Lemma 3.1. We obtain:

Theorem 6.3. PinSketch is an average-case $(\mathsf{SDif}(\mathcal{U}), m, m - t \log(n + 1), t)$ secure sketch for set difference
with storage $t \log(n + 1)$. The algorithms $\mathsf{SS}$ and $\mathsf{Rec}$ both run in time polynomial in $t$ and $\log n$.



7 Constructions for Edit Distance 

The space of interest in this section is the space $\mathcal{F}^*$ for some alphabet $\mathcal{F}$, with distance between two strings
defined as the number of character insertions and deletions needed to get from one string to the other. Denote
this space by $\mathsf{Edit}_{\mathcal{F}}(n)$. Let $F = |\mathcal{F}|$.

First, note that applying the generic approach for transitive metric spaces (as with the Hamming space
and the set difference space for small universe sizes) does not work here, because the edit metric is not
known to be transitive. Instead, we consider embeddings of the edit metric on $\{0, 1\}^n$ into the Hamming or
set difference metric of much larger dimension. We look at two types: standard low-distortion embeddings
and "biometric" embeddings as defined in Section 4.3.

For the binary edit distance space of dimension $n$, we obtain secure sketches and fuzzy extractors
correcting $t$ errors with entropy loss roughly $t n^{o(1)}$, using a standard embedding, and $2.38 \sqrt[3]{t} (n \log n)^{2/3}$, using a
relaxed embedding. The first technique works better when $t$ is small, say, $n^{1-\gamma}$ for a constant $\gamma > 0$. The
second technique is better when $t$ is large; it is meaningful roughly as long as $t < \frac{n}{15 \log^2 n}$.

7.1 Low-Distortion Embeddings

A (standard) embedding with distortion $D$ is an injection $\psi : \mathcal{M}_1 \to \mathcal{M}_2$ such that for any two points
$x, y \in \mathcal{M}_1$, the ratio $\frac{\mathsf{dis}(\psi(x), \psi(y))}{\mathsf{dis}(x, y)}$ is at least 1 and at most $D$.

When the preliminary version of this paper appeared [DRS04], no nontrivial embeddings were known
mapping edit distance into $\ell_1$ or the Hamming metric (i.e., known embeddings had distortion $O(n)$).
Recently, Ostrovsky and Rabani [OR05] gave an embedding of the edit metric over $\mathcal{F} = \{0, 1\}$ into $\ell_1$ with
subpolynomial distortion. It is an injective, polynomial-time computable embedding, which can be
interpreted as mapping to the Hamming space $\{0, 1\}^d$, where $d = \mathrm{poly}(n)$.^11

Fact 7.1 ([OR05]). There is a polynomial-time computable embedding $\psi_{ed} : \mathsf{Edit}_{\{0,1\}}(n) \to \{0, 1\}^{\mathrm{poly}(n)}$
with distortion $D_{ed}(n) = 2^{O(\sqrt{\log n \log \log n})}$.

We can compose this embedding with the fuzzy extractor constructions for the Hamming distance to
obtain a fuzzy extractor for edit distance which will be good when $t$, the number of errors to be corrected, is
quite small. Recall that instantiating the syndrome fuzzy extractor construction (Theorem 5.2) with a BCH
code allows one to correct $t'$ errors out of $d$ at the cost of $t' \log d + 2 \log\left(\frac{1}{\epsilon}\right) - 2$ bits of entropy.

Construction 7. For any length $n$ and error threshold $t$, let $\psi_{ed}$ be the embedding given by Fact 7.1 from
$\mathsf{Edit}_{\{0,1\}}(n)$ into $\{0, 1\}^d$ (where $d = \mathrm{poly}(n)$), and let $\mathsf{syn}$ be the syndrome of a BCH code correcting
$t' = t D_{ed}(n)$ errors in $\{0, 1\}^d$. Let $\{H_x\}_{x \in X}$ be a family of universal hash functions from $\{0, 1\}^d$ to
$\{0, 1\}^\ell$ for some $\ell$. To compute $\mathsf{Gen}$ on input $w \in \mathsf{Edit}_{\{0,1\}}(n)$, pick a random $x$ and output

$$R = H_x(\psi_{ed}(w)), \quad P = (\mathsf{syn}(\psi_{ed}(w)), x) .$$

To compute $\mathsf{Rep}$ on inputs $w'$ and $P = (s, x)$, compute $y = \mathsf{Rec}(\psi_{ed}(w'), s)$, where $\mathsf{Rec}$ is from
Construction 3, and output $R = H_x(y)$.

Because $\psi_{ed}$ is injective, a secure sketch can be constructed similarly: $\mathsf{SS}(w) = \mathsf{syn}(\psi_{ed}(w))$, and to
recover $w$ from $w'$ and $s$, compute $\psi_{ed}^{-1}(\mathsf{Rec}(\psi_{ed}(w'), s))$. However, it is not known to be efficient, because it
is not known how to compute $\psi_{ed}^{-1}$ efficiently.

Proposition 7.2. For any $n, t, m$, there is an average-case $(\mathsf{Edit}_{\{0,1\}}(n), m, m', t)$-secure sketch and an
efficient average-case $(\mathsf{Edit}_{\{0,1\}}(n), m, \ell, t, \epsilon)$-fuzzy extractor where $m' = m - t 2^{O(\sqrt{\log n \log \log n})}$ and
$\ell = m' - 2 \log\left(\frac{1}{\epsilon}\right) + 2$. In particular, for any $\alpha < 1$, there exists an efficient fuzzy extractor tolerating $n^\alpha$ errors
with entropy loss $n^{\alpha + o(1)} + 2 \log\left(\frac{1}{\epsilon}\right)$.

Proof. Construction 7 is the same as the construction of Theorem 5.2 (instantiated with a BCH-code-based
syndrome construction) acting on $\psi_{ed}(w)$. Because $\psi_{ed}$ is injective, the min-entropy of $\psi_{ed}(w)$ is the
same as the min-entropy $m$ of $w$. The entropy loss in Construction 3 instantiated with BCH codes is
$t' \log d = t 2^{O(\sqrt{\log n \log \log n})} \log \mathrm{poly}(n)$. Because $2^{O(\sqrt{\log n \log \log n})}$ grows faster than $\log n$, this is the
same as $t 2^{O(\sqrt{\log n \log \log n})}$. □

Note that the peculiar-looking distortion function from Fact 7.1 increases more slowly than any
polynomial in $n$, but still faster than any polynomial in $\log n$. In sharp contrast, the best lower bound states that any
embedding of $\mathsf{Edit}_{\{0,1\}}(n)$ into $\ell_1$ (and hence Hamming) must have distortion at least $\Omega(\log n / \log \log n)$
[AK07]. Closing the gap between the two bounds remains an open problem.

^11 The embedding of [OR05] produces strings of integers in the space $\{1, \ldots, O(\log n)\}^{\mathrm{poly}(n)}$, equipped with $\ell_1$ distance. One
can convert this into the Hamming metric with only a logarithmic blowup in length by representing each integer in unary.

General Alphabets. To extend the above construction to general $\mathcal{F}$, we represent each character of
$\mathcal{F}$ as a string of $\log F$ bits. This is an embedding of $\mathcal{F}^n$ into $\{0, 1\}^{n \log F}$, which increases edit distance by a
factor of at most $\log F$. Then $t' = t (\log F) D_{ed}(n)$ and $d = \mathrm{poly}(n, \log F)$. Using these quantities, we get
the generalization of Proposition 7.2 for larger alphabets (again, by the same embedding) by changing the
formula for $m'$ to $m' = m - t (\log F) 2^{O(\sqrt{\log(n \log F) \log \log(n \log F)})}$.

7.2 Relaxed Embeddings for the Edit Metric 

In this section, we show that a relaxed notion of embedding, called a biometric embedding in Section 4.3,
can produce fuzzy extractors and secure sketches that are better than what one can get from the embedding
of [OR05] when $t$ is large (they are also much simpler algorithmically, which makes them more practical).
We first discuss fuzzy extractors and later extend the technique to secure sketches.

Fuzzy Extractors. Recall that unlike low-distortion embeddings, biometric embeddings do not care 
about relative distances, as long as points that were "close" (closer than ti) do not become "distant" (farther 
apart than t2). The only additional requirement of a biometric embedding is that it preserve some min- 
entropy: we do not want too many points to collide together. We now describe such an embedding from the 
edit distance to the set difference. 

A $c$-shingle is a length-$c$ consecutive substring of a given string $w$. A $c$-shingling [Bro97] of a string
$w$ of length $n$ is the set (ignoring order or repetition) of all $(n - c + 1)$ $c$-shingles of $w$. (For instance,
a 3-shingling of "abcdecdeah" is {abc, bcd, cde, dec, ecd, dea, eah}.) Thus, the range of the $c$-shingling
operation consists of all nonempty subsets of size at most $n - c + 1$ of $\mathcal{F}^c$. Let $\mathsf{SDif}(\mathcal{F}^c)$ stand for the set
difference metric over subsets of $\mathcal{F}^c$ and $\mathsf{SH}_c$ stand for the $c$-shingling map from $\mathsf{Edit}_{\mathcal{F}}(n)$ to $\mathsf{SDif}(\mathcal{F}^c)$. We
now show that $\mathsf{SH}_c$ is a good biometric embedding.

Lemma 7.3. For any $c$, $\mathsf{SH}_c$ is an average-case $(t_1, t_2 = (2c - 1) t_1, m_1, m_2 = m_1 - \lceil \frac{n}{c} \rceil \log_2(n - c + 1))$-
biometric embedding of $\mathsf{Edit}_{\mathcal{F}}(n)$ into $\mathsf{SDif}(\mathcal{F}^c)$.

Proof. Let $w, w' \in \mathsf{Edit}_{\mathcal{F}}(n)$ be such that $\mathsf{dis}(w, w') \le t_1$ and let $I$ be the sequence of at most $t_1$
insertions and deletions that transforms $w$ into $w'$. It is easy to see that each character deletion or insertion
adds at most $(2c - 1)$ to the symmetric difference between $\mathsf{SH}_c(w)$ and $\mathsf{SH}_c(w')$, which implies that
$\mathsf{dis}(\mathsf{SH}_c(w), \mathsf{SH}_c(w')) \le (2c - 1) t_1$, as needed.

For $w \in \mathcal{F}^n$, define $g_c(w)$ as follows. Compute $\mathsf{SH}_c(w)$ and store the resulting shingles in lexicographic
order $h_1 \ldots h_k$ ($k \le n - c + 1$). Next, naturally partition $w$ into $\lceil n/c \rceil$ $c$-shingles $s_1 \ldots s_{\lceil n/c \rceil}$, all disjoint
except for (possibly) the last two, which overlap by $c \lceil n/c \rceil - n$ characters. Next, for $1 \le j \le \lceil n/c \rceil$, set
$p_j$ to be the index $i \in \{1, \ldots, k\}$ such that $s_j = h_i$. In other words, $p_j$ tells the index of the $j$th disjoint
shingle of $w$ in the alphabetically ordered $k$-set $\mathsf{SH}_c(w)$. Set $g_c(w) = (p_1, \ldots, p_{\lceil n/c \rceil})$. (For instance,
$g_3$("abcdecdeah") $= (1, 5, 4, 6)$, representing the alphabetical order of "abc", "dec", "dea" and "eah" in
$\mathsf{SH}_3$("abcdecdeah").) The number of possible values for $g_c(w)$ is at most $(n - c + 1)^{\lceil n/c \rceil}$, and $w$ can be
completely recovered from $\mathsf{SH}_c(w)$ and $g_c(w)$.

Now, assume $W$ is any distribution of min-entropy at least $m_1$ on $\mathsf{Edit}_{\mathcal{F}}(n)$. Applying Lemma 2.2(b),
we get $\tilde{H}_\infty(W \mid g_c(W)) \ge m_1 - \lceil \frac{n}{c} \rceil \log_2(n - c + 1)$. Since $\Pr(W = w \mid g_c(W) = g) = \Pr(\mathsf{SH}_c(W) =
\mathsf{SH}_c(w) \mid g_c(W) = g)$ (because given $g_c(w)$, $\mathsf{SH}_c(w)$ uniquely determines $w$ and vice versa), by applying
the definition of $\tilde{H}_\infty$, we obtain $H_\infty(\mathsf{SH}_c(W)) \ge \tilde{H}_\infty(\mathsf{SH}_c(W) \mid g_c(W)) = \tilde{H}_\infty(W \mid g_c(W))$. The same
proof holds for average min-entropy, conditioned on some auxiliary information $I$. □

By Theorem 6.3, for universe $\mathcal{F}^c$ of size $F^c$ and distance threshold $t_2 = (2c - 1) t_1$, we can construct
a secure sketch for the set difference metric with entropy loss $t_2 \lceil \log(F^c + 1) \rceil$ ($\lceil \cdot \rceil$ because Theorem 6.3
requires the universe size to be one less than a power of 2). By Lemma 4.3, we can obtain a fuzzy extractor
from such a sketch, with additional entropy loss $2 \log\left(\frac{1}{\epsilon}\right) - 2$. Applying the biometric-embedding lemma of
Section 4.3 to the above embedding and this fuzzy extractor, we obtain a fuzzy extractor for $\mathsf{Edit}_{\mathcal{F}}(n)$, any input
entropy $m$, any distance $t$, and any security parameter $\epsilon$, with the following entropy loss:

$$\left\lceil \frac{n}{c} \right\rceil \log_2(n - c + 1) + (2c - 1) t \left\lceil \log(F^c + 1) \right\rceil + 2 \log\left(\frac{1}{\epsilon}\right) - 2$$

(the first component of the entropy loss comes from the embedding, the second from the secure sketch for
set difference, and the third from the extractor). The above sequence of lemmas results in the following
construction, parameterized by shingle length $c$ and a family of universal hash functions
$\mathcal{H} = \{\mathsf{SDif}(\mathcal{F}^c) \to \{0, 1\}^\ell\}_{x \in X}$, where $\ell$ is equal to the input entropy $m$ minus the entropy loss above.

Construction 8 (Fuzzy Extractor for Edit Distance).
To compute $\mathsf{Gen}(w)$ for $|w| = n$:

1. Compute $\mathsf{SH}_c(w)$ by computing $n - c + 1$ shingles $(v_1, v_2, \ldots, v_{n-c+1})$ and removing duplicates to
form the shingle set $v$ from $w$.

2. Compute $s = \mathsf{syn}(x_v)$ as in Construction 6.

3. Select a hash function $H_x \in \mathcal{H}$ and output $(R = H_x(v), P = (s, x))$.

To compute $\mathsf{Rep}(w', (s, x))$:

1. Compute $\mathsf{SH}_c(w')$ as above to get $v'$.

2. Use $\mathsf{Rec}(v', s)$ from Construction 6 to recover $v$.

3. Output $R = H_x(v)$.

We thus obtain the following theorem.

Theorem 7.4. For any $n, m, c$ and $0 < \epsilon \le 1$, there is an efficient average-case $(\mathsf{Edit}_{\mathcal{F}}(n), m, m -
\lceil \frac{n}{c} \rceil \log_2(n - c + 1) - (2c - 1) t \lceil \log(F^c + 1) \rceil - 2 \log\left(\frac{1}{\epsilon}\right) + 2, t, \epsilon)$-fuzzy extractor.

Note that the choice of $c$ is a parameter; by ignoring $\lceil \cdot \rceil$ and replacing $n - c + 1$ with $n$, $2c - 1$ with $2c$
and $F^c + 1$ with $F^c$, we get that the minimum entropy loss occurs near

$$c = \left( \frac{n \log n}{4 t \log F} \right)^{1/3}$$

and is about $2.38 (t \log F)^{1/3} (n \log n)^{2/3}$ (2.38 is really $\sqrt[3]{4} + 1/\sqrt[3]{2}$). In particular, if the original string has
a linear amount of entropy $\Theta(n \log F)$, then we can tolerate $t = \Omega(n \log^2 F / \log^2 n)$ insertions and deletions
while extracting $\Theta(n \log F) - 2 \log\left(\frac{1}{\epsilon}\right)$ bits. The number of bits extracted is linear; if the string length $n$ is
polynomial in the alphabet size $F$, then the number of errors tolerated is linear also.
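The optimization over $c$ can be reproduced numerically. The Python sketch below uses the simplified loss $\frac{n}{c} \log n + 2 t c^2 \log F$ (ceilings ignored and the extractor term dropped, as in the text), evaluates it at the stated optimum, and compares the result with the closed form. The function names and the sample values of $n$, $t$, $F$ are our own assumptions.

```python
from math import log2


def simplified_loss(c, n, t, F):
    """Simplified entropy loss: (n/c) log n from the embedding plus 2c * t * c log F from the sketch."""
    return (n / c) * log2(n) + 2 * t * c * c * log2(F)


def optimal_c(n, t, F):
    """Minimizer of simplified_loss over real c > 0: (n log n / (4 t log F))^(1/3)."""
    return (n * log2(n) / (4 * t * log2(F))) ** (1 / 3)


CONSTANT = 4 ** (1 / 3) + 2 ** (-1 / 3)   # the "2.38" in the text


def closed_form_loss(n, t, F):
    """CONSTANT * (t log F)^(1/3) * (n log n)^(2/3)."""
    return CONSTANT * (t * log2(F)) ** (1 / 3) * (n * log2(n)) ** (2 / 3)


n, t, F = 10**6, 100, 26          # assumed sample parameters
c_star = optimal_c(n, t, F)
best = simplified_loss(c_star, n, t, F)
```

Setting the derivative $-\frac{n \log n}{c^2} + 4 t c \log F$ to zero gives the stated $c$, and substituting back yields the two terms $\sqrt[3]{4}$ and $1/\sqrt[3]{2}$ of the constant. In practice $c$ must be an integer, so one would compare the loss at the two integers nearest to `c_star`.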

Secure Sketches. Observe that the proof of Lemma 7.3 actually demonstrates that our biometric
embedding based on shingling is an embedding with recovery information $g_c$. Observe also that it is easy to
reconstruct $w$ from $\mathsf{SH}_c(w)$ and $g_c(w)$. Finally, note that PinSketch (Construction 6) is an average-case
secure sketch (as are all secure sketches in this work). Thus, combining Theorem 6.3 with Lemma 4.7, we
obtain the following theorem.

Construction 9 (Secure Sketch for Edit Distance). For $\mathsf{SS}(w)$, compute $v = \mathsf{SH}_c(w)$ and $s_1 = \mathsf{syn}(x_v)$ as
in Construction 6. Compute $s_2 = g_c(w)$, writing each $p_j$ as a string of $\lceil \log n \rceil$ bits. Output $s = (s_1, s_2)$.
For $\mathsf{Rec}(w', (s_1, s_2))$, recover $v$ as in Construction 8, sort it in alphabetical order, and recover $w$ by stringing
along elements of $v$ according to indices in $s_2$.

Theorem 7.5. For any $n, m, c$ and $0 < \epsilon \le 1$, there is an efficient average-case $(\mathsf{Edit}_{\mathcal{F}}(n), m, m -
\lceil \frac{n}{c} \rceil \log_2(n - c + 1) - (2c - 1) t \lceil \log(F^c + 1) \rceil, t)$ secure sketch.

The discussion about optimal values of $c$ from above applies equally here.
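The shingling map $\mathsf{SH}_c$, the recovery information $g_c$, and the final reconstruction step of Construction 9 can be sketched in a few lines of Python. The function names are hypothetical, and the sketch covers only the string-handling part; the set difference sketch that recovers $v$ from $v'$ is assumed to be available separately.

```python
from math import ceil


def shingles(w, c):
    """SH_c(w): the set of all length-c substrings of w (order and repetition ignored)."""
    return {w[i:i + c] for i in range(len(w) - c + 1)}


def recovery_info(w, c):
    """g_c(w): 1-based ranks, in sorted SH_c(w), of the ceil(n/c) covering shingles of w.

    The covering shingles are disjoint except that the last one is shifted left
    so that it ends at the end of w (it overlaps its predecessor by c*ceil(n/c) - n).
    """
    ordered = sorted(shingles(w, c))
    n = len(w)
    blocks = ceil(n / c)
    starts = [j * c for j in range(blocks - 1)] + [n - c]
    return tuple(ordered.index(w[st:st + c]) + 1 for st in starts)


def reconstruct(shingle_set, g, n, c):
    """Recover w of length n from SH_c(w) and g_c(w)."""
    ordered = sorted(shingle_set)
    pieces = [ordered[p - 1] for p in g]
    overlap = c * len(pieces) - n          # characters shared by the last two pieces
    return "".join(pieces[:-1]) + pieces[-1][overlap:]
```

On the running example, `shingles("abcdecdeah", 3)` gives the seven shingles listed earlier, `recovery_info("abcdecdeah", 3)` gives `(1, 5, 4, 6)`, and `reconstruct` applied to these two values with `n = 10` and `c = 3` returns the original string.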

Remark 1. In our definitions of secure sketches and fuzzy extractors, we required the original $w$ and the
(potentially) modified $w'$ to come from the same space $\mathcal{M}$. This requirement was for simplicity of
exposition. We can allow $w'$ to come from a larger set, as long as distance from $w$ is well-defined. In the case of
edit distance, for instance, $w'$ can be shorter or longer than $w$; all the above results will apply as long as it is
still within $t$ insertions and deletions.

8 Probabilistic Notions of Correctness 

The error model considered so far in this work is very strong: we required that secure sketches and fuzzy
extractors accept every secret $w'$ within distance $t$ of the original input $w$, with no probability of error.

Such a stringent model is useful as it makes no assumptions on either the exact stochastic properties of
the error process or the adversary's computational limits. However, Lemma C.1 shows that secure sketches
(and fuzzy extractors) correcting $t$ errors can only be as "good" as error-correcting codes with minimum
distance $2t + 1$. By slightly relaxing the correctness condition, we will see that one can tolerate many more
errors. For example, there is no good code which can correct $n/4$ errors in the binary Hamming metric:
by the Plotkin bound (see, e.g., [Sud01, Lecture 8]) a code with minimum distance greater than $n/2$ has at
most $2n$ codewords. Thus, there is no secure sketch with residual entropy $m' \ge \log n$ which can correct
$n/4$ errors with probability 1. However, with the relaxed notions of correctness below, one can tolerate
arbitrarily close to $n/2$ errors, i.e., correct $n(\frac{1}{2} - \gamma)$ errors for any constant $\gamma > 0$, and still have residual
entropy $\Omega(n)$.

In this section, we discuss three relaxed error models and show how the constructions of the previous
sections can be modified to gain greater error-correction in these models. We will focus on secure sketches
for the binary Hamming metric. The same constructions yield fuzzy extractors (by Lemma 4.1). Many of
the observations here also apply to metrics other than Hamming.

A common point is that we will require only that a corrupted input $w'$ be corrected with probability at
least $1 - \alpha < 1$ (the probability space varies). We describe each model in terms of the additional assumptions
made on the error process. We describe constructions for each model in the subsequent sections.

Random Errors. Assume there is a known distribution on the errors which occur in the data. For the
Hamming metric, the most common distribution is the binary symmetric channel $\mathsf{BSC}_p$: each bit of
the input is flipped with probability $p$ and left untouched with probability $1 - p$. We require that for
any input $w$, $\mathsf{Rec}(W', \mathsf{SS}(w)) = w$ with probability at least $1 - \alpha$ over the coins of $\mathsf{SS}$ and over $W'$
drawn by applying the noise distribution to $w$.

In that case, one can correct an error rate up to Shannon's bound on noisy channel coding. This bound
is tight. Unfortunately, the assumption of a known noise process is too strong for most applications:
there is no reason to believe we understand the exact distribution on errors which occur in complex
data such as biometrics.^12 However, it provides a useful baseline by which to measure results for other
models.

Input-dependent Errors. The errors are adversarial, subject only to the conditions that (a) the error
magnitude $\mathsf{dis}(w, w')$ is bounded to a maximum of $t$, and (b) the corrupted word depends only on the input
$w$, and not on the secure sketch $\mathsf{SS}(w)$. Here we require that for any pair $w, w'$ at distance at most $t$,
we have $\mathsf{Rec}(w', \mathsf{SS}(w)) = w$ with probability at least $1 - \alpha$ over the coins of $\mathsf{SS}$.

This model encompasses any complex noise process which has been observed to never introduce more
than $t$ errors. Unlike the assumption of a particular distribution on the noise, the bound on magnitude
can be checked experimentally. Perhaps surprisingly, in this model we can tolerate just as large an
error rate as in the model of random errors. That is, we can tolerate an error rate up to Shannon's
coding bound and no more.

Computationally bounded Errors. The errors are adversarial and may depend on both $w$ and the publicly
stored information $\mathsf{SS}(w)$. However, we assume that the errors are introduced by a process of bounded
computational power. That is, there is a probabilistic circuit of polynomial size (in the length $n$) which
computes $w'$ from $w$. The adversary cannot, for example, forge a digital signature and base the error
pattern on the signature.

It is not clear whether this model allows correcting errors up to the Shannon bound, as in the two
models above. The question is related to open questions on the construction of efficiently list-decodable
codes. However, when the error rate is either very high or very low, then the appropriate list-decodable
codes exist and we can indeed match the Shannon bound.

Analogues for Noisy Channels and the Hamming Metric. Models analogous to the ones
above have been studied in the literature on codes for noisy binary channels (with the Hamming
metric). Random errors and computationally bounded errors both make obvious sense in the coding
context [Sha48, MPSW05]. The second model, input-dependent errors, does not immediately make sense
in a coding situation, since there is no data other than the transmitted codeword on which errors
could depend. Nonetheless, there is a natural, analogous model for noisy channels: one can allow the
sender and receiver to share either (1) common, secret random coins (see [DGL04, Lan04] and references
therein) or (2) a side channel with which they can communicate a small number of noise-free, secret
bits [Gur03].

Existing results on these three models for the Hamming metric can be transported to our context using 
the code-offset construction: 

$$\mathsf{SS}(w; x) = w \oplus C(x).$$

Roughly, any code which corrects errors in the models above will lead to a secure sketch (resp. fuzzy 
extractor) which corrects errors in the model. We explore the consequences for each of the three models in 
the next sections. 
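
As an illustration of the code-offset construction, the following minimal sketch instantiates $C$ with a binary repetition code. The repetition factor `REP`, the function names, and the bit-list representation are assumptions made for this example only; a real instantiation would use a much better code.

```python
import secrets

REP = 5  # hypothetical repetition factor; majority vote fixes up to 2 flips per block


def encode(x):
    """Repetition code C: repeat each message bit REP times."""
    return [b for b in x for _ in range(REP)]


def decode(c):
    """Majority-vote decoding of each REP-bit block."""
    return [int(sum(c[i:i + REP]) > REP // 2) for i in range(0, len(c), REP)]


def xor(a, b):
    return [u ^ v for u, v in zip(a, b)]


def sketch(w):
    """SS(w; x) = w XOR C(x) for a fresh random message x (len(w) must be a multiple of REP)."""
    x = [secrets.randbelow(2) for _ in range(len(w) // REP)]
    return xor(w, encode(x))


def recover(w_noisy, s):
    """Shift w' by the sketch, decode to the nearest codeword, and shift back."""
    codeword = encode(decode(xor(w_noisy, s)))
    return xor(codeword, s)
```

Recovery works because $w' \oplus \mathsf{SS}(w; x) = C(x) \oplus (w \oplus w')$, so decoding removes the error pattern $w \oplus w'$ whenever the code can correct it.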

* Since the assumption here plays a role only in correctness, it is still more reasonable than
assuming that we know exact distributions on the data in proofs of secrecy. However, in both cases,
we would like to enlarge the class of distributions for which we can provably satisfy the definition
of security.

8.1 Random Errors

The random error model was famously considered by Shannon [Sha48]. He showed that for any discrete,
memoryless channel, the rate at which information can be reliably transmitted is characterized by the
maximum mutual information between the inputs and outputs of the channel. For the binary symmetric
channel with crossover probability $p$, this means that there exist codes encoding $k$ bits into $n$
bits, tolerating error probability $p$ in each bit if and only if

$$\frac{k}{n} \le 1 - h(p) - \delta(n),$$

where $h(p) = -p \log p - (1 - p) \log(1 - p)$ and $\delta(n) = o(1)$. Computationally efficient codes
achieving this bound were found later, most notably by Forney [For66]. We can use the code-offset
construction $\mathsf{SS}(w; x) = w \oplus C(x)$ with an appropriate concatenated code [For66] or,
equivalently, $\mathsf{SS}(w) = \mathsf{syn}_C(w)$ since the codes can be linear. We obtain:

Proposition 8.1. For any error rate $0 < p < 1/2$ and constant $\delta > 0$, for large enough $n$ there
exist secure sketches with entropy loss $(h(p) + \delta)n$, which correct the error rate of $p$ in the
data with high probability (the failure probability is roughly $2^{-c_\delta n}$ for a constant
$c_\delta > 0$).

The probability here is taken over the errors only (the distribution on input strings w can be arbitrary). 

The quantity $h(p)$ is less than 1 for any $p$ in the range $(0, 1/2)$. In particular, one can get
nontrivial secure sketches even for a very high error rate $p$ as long as it is less than 1/2; in
contrast, no secure sketch which corrects errors with probability 1 can tolerate $t \ge n/4$. Note that
several other works on biometric cryptosystems consider the model of randomized errors and obtain
similar results, though the analyses assume that the distribution on inputs is uniform [TG04, CZ04].
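
To make the bound of Proposition 8.1 concrete, the short sketch below evaluates the binary entropy function $h(p)$ and the resulting entropy loss $(h(p) + \delta)n$. The function names are chosen for this example, and logarithms are taken base 2.

```python
import math


def binary_entropy(p):
    """h(p) = -p log p - (1 - p) log(1 - p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def entropy_loss_bound(n, p, delta):
    """Entropy loss (h(p) + delta) * n from Proposition 8.1."""
    return (binary_entropy(p) + delta) * n
```

For instance, at $p = 0.11$ the value $h(p)$ is very close to $1/2$, so such a sketch gives up roughly half of the $n$ bits.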

A Matching Impossibility Result. The bound above is tight. The matching impossibility result also
applies to input-dependent and computationally bounded errors, since random errors are a special case
of both more complex models.

We start with an intuitive argument: If a secure sketch allows recovering from random errors with high
probability, then it must contain enough information about $w$ to describe the error pattern (since
given $w'$ and $\mathsf{SS}(w)$, one can recover the error pattern with high probability). Describing
the outcome of $n$ independent coin flips with probability $p$ of heads requires $nh(p)$ bits, and so
the sketch must reveal $nh(p)$ bits about $w$.

In fact, that argument simply shows that $nh(p)$ bits of Shannon information are leaked about $w$,
whereas we are concerned with min-entropy loss as defined in Section 3. To make the argument more
formal, let $W$ be uniform over $\{0,1\}^n$ and observe that with high probability over the output of
the sketching algorithm, $v = \mathsf{SS}(w)$, the conditional distribution
$W_v = W|_{\mathsf{SS}(W)=v}$ forms a good code for the binary symmetric channel. That is, for most
values $v$, if we sample a random string $w$ from $W|_{\mathsf{SS}(W)=v}$ and send it through a binary
symmetric channel, we will be able to recover the correct value $w$. That means there exists some $v$
such that both (a) $W_v$ is a good code and (b) $\mathbf{H}_\infty(W_v)$ is close to
$\tilde{\mathbf{H}}_\infty(W \mid \mathsf{SS}(W))$. Shannon's noisy coding theorem says that such a code
can have entropy at most $n(1 - h(p) + o(1))$. Thus the construction above is optimal:

Proposition 8.2. For any error rate $0 < p < 1/2$, any secure sketch $\mathsf{SS}$ which corrects
random errors (with rate $p$) with probability at least 2/3 has entropy loss at least
$n(h(p) - o(1))$; that is, $\tilde{\mathbf{H}}_\infty(W \mid \mathsf{SS}(W)) \le n(1 - h(p) - o(1))$
when $W$ is drawn uniformly from $\{0,1\}^n$.

8.2 Randomizing Input-dependent Errors 

Assuming errors distributed randomly according to a known distribution seems very limiting. In the
Hamming metric, one can construct a secure sketch which achieves the same result as with random errors
for every error process where the magnitude of the error is bounded, as long as the errors are
independent of the output of $\mathsf{SS}(W)$. The same technique was used previously by Bennett et al.
[BBR88, p. 216] and, in a slightly different context, Lipton [Lip94, DGL04].



The idea is to choose a random permutation $\pi : [n] \to [n]$, permute the bits of $w$ before applying
the sketch, and store the permutation $\pi$ along with $\mathsf{SS}(\pi(w))$. Specifically, let $C$ be
a linear code tolerating a $p$ fraction of random errors with redundancy $n - k \approx nh(p)$. Let

$$\mathsf{SS}(w; \pi) = (\pi, \mathsf{syn}_C(\pi(w))),$$

where $\pi : [n] \to [n]$ is a random permutation and, for $w = w_1 \cdots w_n \in \{0,1\}^n$, $\pi(w)$
denotes the permuted string $w_{\pi(1)} w_{\pi(2)} \cdots w_{\pi(n)}$. The recovery algorithm operates
in the obvious way: it first permutes the input $w'$ according to $\pi$ and then runs the usual
syndrome recovery algorithm to recover $\pi(w)$, from which $w$ is obtained by inverting $\pi$.

For any particular pair $w, w'$, the difference $w \oplus w'$ will be mapped to a random vector of the
same weight by $\pi$, and any code for the binary symmetric channel (with rate $p \approx t/n$) will
correct such an error with high probability.
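
The mechanics of the permuted syndrome sketch can be sketched as follows. As a stand-in for the code $C$, this example uses the [7,4] Hamming code, which corrects a single error deterministically, so it shows only the data flow and not the probabilistic guarantee. The names and the use of Python's non-cryptographic `random.shuffle` are assumptions of this illustration.

```python
import random

N = 7  # block length of the [7,4] Hamming code (corrects one error)


def syndrome(word):
    """Syndrome for the parity-check matrix whose i-th column is i + 1 in binary."""
    s = 0
    for i, bit in enumerate(word):
        if bit:
            s ^= i + 1
    return s


def permute(word, perm):
    """pi(w) = w[pi(0)] w[pi(1)] ... (0-based positions)."""
    return [word[j] for j in perm]


def sketch(w, rng=random):
    """SS(w; pi) = (pi, syn(pi(w))) for a random permutation pi."""
    perm = list(range(N))
    rng.shuffle(perm)
    return perm, syndrome(permute(w, perm))


def recover(w_noisy, sk):
    perm, syn = sk
    v = permute(w_noisy, perm)
    err = syndrome(v) ^ syn       # syndrome of the permuted error pattern
    if err:
        v[err - 1] ^= 1           # single-error correction
    w = [0] * N
    for i, j in enumerate(perm):  # undo the permutation
        w[j] = v[i]
    return w
```

Because the syndrome map is linear, `syndrome(v) ^ syn` equals the syndrome of the permuted difference $\pi(w \oplus w')$, which is all the decoder needs.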

Thus we can construct a sketch with entropy loss $n(h(t/n) - o(1))$ which corrects any $t$ flipped
bits with high probability. This is optimal by the lower bound for random errors (Proposition 8.2),
since a sketch for data-dependent errors will also correct random errors. It is also possible to
reduce the amount of randomness, so that the size of the sketch meets the same optimal bound [Smi07].

An alternative approach to input-dependent errors is discussed in the last paragraph of Section [ 



8.3 Handling Computationally Bounded Errors Via List Decoding 

As mentioned above, many results on noisy coding for other error models in Hamming space extend to 
secure sketches. The previous sections discussed random, and randomized, errors. In this section, we 
discuss constructions [Gur03, Lan04, MPSW05] which transform a list-decodable code, defined below,
into uniquely decodable codes for a particular error model. These transformations can also be used in the 
setting of secure sketches, leading to better tolerance of computationally bounded errors. For some ranges 
of parameters, this yields optimal sketches, that is, sketches which meet the Shannon bound on the fraction 
of tolerated errors. 

List-Decodable Codes. A code $C$ in a metric space $\mathcal{M}$ is called list-decodable with list
size $L$ and distance $t$ if for every point $x \in \mathcal{M}$, there are at most $L$ codewords
within distance $t$ of $x$. A list-decoding algorithm takes as input a word $x$ and returns the
corresponding list $c_1, c_2, \ldots$ of codewords. The most interesting setting is when $L$ is a small
polynomial (in the description size $\log|\mathcal{M}|$), and there exists an efficient list-decoding
algorithm. It is then feasible for an algorithm to go over each word in the list and accept if it has
some desirable property. There are many examples of such codes for the Hamming space; for a survey see
Guruswami's thesis [Gur01]. Recently there has been significant progress in constructing
list-decodable codes for large alphabets, e.g., [PV05, GR06].

Similarly, we can define a list-decodable secure sketch with size $L$ and distance $t$ as follows: for
any pair of words $w, w' \in \mathcal{M}$ at distance at most $t$, the algorithm
$\mathsf{Rec}(w', \mathsf{SS}(w))$ returns a list of at most $L$ points in $\mathcal{M}$; if
$\mathsf{dis}(w, w') \le t$, then one of the words in the list must be $w$ itself. The simplest way to
obtain a list-decodable secure sketch is to use the code-offset construction of Section 5 with a
list-decodable code for the Hamming space. One obtains a different example by running the improved
Juels-Sudan scheme for set difference (Construction 5), replacing ordinary decoding of Reed-Solomon
codes with list decoding. This yields a significant improvement in the number of errors tolerated at
the price of returning a list of possible candidates for the original secret.

Sieving the List. Given a list-decodable secure sketch $\mathsf{SS}$, all that's needed is to store
some additional information which allows the receiver to disambiguate $w$ from the list. Let's
suggestively name the additional information $\mathsf{Tag}(w; R)$, where $R$ is some additional
randomness (perhaps a key). Given a list-decodable code $C$, the sketch will typically look like

$$\mathsf{SS}(w; x) = (w \oplus C(x), \mathsf{Tag}(w)).$$

On inputs $w'$ and $(\Delta, tag)$, the recovery algorithm consists of running the list-decoding
algorithm on $w' \oplus \Delta$ to obtain a list of possible codewords $C(x_1), \ldots, C(x_L)$. There
is a corresponding list of candidate inputs $w_1, \ldots, w_L$, where $w_i = C(x_i) \oplus \Delta$, and
the algorithm outputs the first $w_i$ in the list such that $\mathsf{Tag}(w_i) = tag$. We will choose
the function $\mathsf{Tag}()$ so that the adversary cannot arrange to have two values in the list
with valid tags.
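
The sieving step can be illustrated with a toy example. The [6,3] code, the brute-force list decoder, the decoding radius, and the use of unkeyed SHA-256 as the tag are all assumptions made for this sketch; they stand in for an efficiently list-decodable code and for a properly chosen $\mathsf{Tag}()$ function.

```python
import hashlib
from itertools import product

K, RADIUS = 3, 2  # hypothetical toy parameters: 3 message bits, list-decoding radius 2


def encode(x):
    """Toy [6,3] code: the message bits followed by pairwise parities (minimum distance 3)."""
    a, b, c = x
    return (a, b, c, a ^ b, b ^ c, a ^ c)


def list_decode(word):
    """Brute-force list decoding: every codeword within RADIUS of word."""
    found = []
    for x in product((0, 1), repeat=K):
        cw = encode(x)
        if sum(u != v for u, v in zip(cw, word)) <= RADIUS:
            found.append(cw)
    return found


def tag(w):
    """Stand-in for Tag(w): an unkeyed SHA-256 digest of the bits."""
    return hashlib.sha256(bytes(w)).digest()


def xor(a, b):
    return tuple(u ^ v for u, v in zip(a, b))


def sketch(w, x):
    """SS(w; x) = (w XOR C(x), Tag(w))."""
    return xor(w, encode(x)), tag(w)


def recover(w_noisy, sk):
    """List-decode w' XOR Delta, then output the first candidate whose tag matches."""
    delta, t = sk
    for cw in list_decode(xor(w_noisy, delta)):
        candidate = xor(cw, delta)
        if tag(candidate) == t:
            return candidate
    return None
```

With two flipped bits, which is beyond the unique-decoding radius of this toy code, the list can contain several codewords, and the tag comparison selects the right one.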

We consider two $\mathsf{Tag}()$ functions, inspired by [Gur03, Lan04, MPSW05].

1. Recall that for computationally bounded errors, the corrupted string $w'$ depends on both $w$ and
$\mathsf{SS}(w)$, but $w'$ is computed by a probabilistic circuit of size polynomial in $n$.

Consider $\mathsf{Tag}(w) = \mathsf{hash}(w)$, where $\mathsf{hash}$ is drawn from a
collision-resistant function family. More specifically, we will use some extra randomness $r$ to
choose a key $key$ for a collision-resistant hash family. The output of the sketch is then

$$\mathsf{SS}(w; x, r) = (w \oplus C(x), key(r), \mathsf{hash}_{key(r)}(w)).$$

If the list-decoding algorithm for the code $C$ runs in polynomial time, then the adversary succeeds
only if he can find a value $w_i \ne w$ such that $\mathsf{hash}_{key}(w_i) = \mathsf{hash}_{key}(w)$,
that is, only by finding a collision for the hash function. By assumption, a polynomially bounded
adversary succeeds only with negligible probability.

The additional entropy loss, beyond that of the code-offset part of the sketch, is bounded above by the
output length of the hash function. If $\alpha$ is the desired bound on the adversary's success
probability, then for standard assumptions on hash functions this loss will be polynomial in
$\log(1/\alpha)$.

In principle this transformation can yield sketches which achieve the optimal entropy loss
$n(h(t/n) - o(1))$, since codes with polynomial list size $L$ are known to exist for error rates
approaching the Shannon bound. However, in order to use the construction the code must also be equipped
with a reasonably efficient algorithm for finding such a list. This is necessary both so that recovery
will be efficient and, more subtly, for the proof of security to go through (that way we can assume
that the polynomial-time adversary knows the list of words generated during the recovery procedure).
We do not know of efficient (i.e., polynomial-time constructible and decodable) binary list-decodable
codes which meet the Shannon bound for all choices of parameters. However, when the error rate is near
1/2 such codes are known [GS00]. Thus, this type of construction yields essentially optimal sketches
when the error rate is near 1/2. This is quite similar to analogous results on channel coding
[MPSW05]. Relatively little is known about the performance of efficiently list-decodable codes in
other parameter ranges for binary alphabets [Gur01].

2. A similar, even simpler, transformation can be used in the setting of input-dependent errors (i.e.,
when the errors depend only on the input and not on the sketch, but the adversary is not assumed
to be computationally bounded). One can store $\mathsf{Tag}(w) = (I, h_I(w))$, where
$\{h_i\}_{i \in \mathcal{I}}$ comes from a universal hash family mapping from $\mathcal{M}$ to
$\{0,1\}^\ell$, where $\ell = \log(1/\alpha) + \log L$ and $\alpha$ is the probability of an incorrect
decoding.

The proof is simple: the values $w_1, \ldots, w_L$ do not depend on $I$, and so for any value
$w_i \ne w$, the probability that $h_I(w_i) = h_I(w)$ is $2^{-\ell}$. There are at most $L$ possible
candidates, and so the probability that any one of the incorrect elements in the list is accepted is
at most $L \cdot 2^{-\ell} = \alpha$. The additional entropy loss incurred is at most
$\ell = \log(1/\alpha) + \log L$.

In principle, this transformation can do as well as the randomization approach of the previous section.
However, we do not know of efficient binary list-decodable codes meeting the Shannon bound for
most parameter ranges. Thus, in general, randomizing the errors (as in the previous section) works
better in the input-dependent setting.
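
The universal-hash tag of the second transformation can be sketched as follows. As an assumption of this example, the family is the set of random binary $\ell \times n$ matrices acting over GF(2), for which two distinct inputs collide with probability exactly $2^{-\ell}$; inputs are represented as $n$-bit integers and all names are illustrative.

```python
import math
import secrets


def tag_length(alpha, list_size):
    """ell = log(1/alpha) + log L, rounded up to a whole number of bits."""
    return math.ceil(math.log2(1 / alpha) + math.log2(list_size))


def sample_hash(n_bits, ell):
    """Index I of the family: a random ell x n_bits binary matrix, one integer per row."""
    return [secrets.randbits(n_bits) for _ in range(ell)]


def apply_hash(index, w):
    """h_I(w) = I * w over GF(2), where w is an n-bit integer."""
    return tuple(bin(row & w).count("1") & 1 for row in index)


def make_tag(w, n_bits, alpha, list_size):
    """Tag(w) = (I, h_I(w)) for a freshly sampled index I."""
    index = sample_hash(n_bits, tag_length(alpha, list_size))
    return index, apply_hash(index, w)


def sieve(candidates, tag):
    """Return the first candidate in the decoded list whose hash matches the stored tag."""
    index, digest = tag
    for candidate in candidates:
        if apply_hash(index, candidate) == digest:
            return candidate
    return None
```

For example, with $\alpha = 2^{-20}$ and list size $L = 8$ the tag adds $\ell = 23$ bits of entropy loss.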

9 Secure Sketches and Efficient Information Reconciliation 

Suppose Alice holds a set $w$ and Bob holds a set $w'$ that are close to each other. They wish to
reconcile the sets: to discover the symmetric difference $w \triangle w'$ so that they can take
whatever appropriate (application-dependent) action to make their two sets agree. Moreover, they wish
to do this communication-efficiently, without having to transmit entire sets to each other. This
problem is known as set reconciliation and naturally arises in various settings.

Let $(\mathsf{SS}, \mathsf{Rec})$ be a secure sketch for set difference that can handle distance up to
$t$; furthermore, suppose that $|w \triangle w'| \le t$. Then if Bob receives $s = \mathsf{SS}(w)$ from
Alice, he will be able to recover $w$, and therefore $w \triangle w'$, from $s$ and $w'$. Similarly,
Alice will be able to find $w \triangle w'$ upon receiving $s' = \mathsf{SS}(w')$ from Bob. This will
be communication-efficient if $|s|$ is small. Note that our secure sketches for set difference of
Sections 6.2 and 6.3 are indeed short; in fact, they are secure precisely because they are short.
Thus, they also make good set reconciliation schemes.
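
A toy instance of this reduction, for the special case $t = 1$, is sketched below. It assumes set elements are nonzero integers viewed as elements of $GF(2^m)$, so that field addition is XOR and the sketch is the single power sum $\sum_{x \in w} x$ (a one-element syndrome); handling larger $t$ requires the full constructions of Section 6 and is not shown. The function names are illustrative.

```python
from functools import reduce


def sketch(elements):
    """XOR of all elements: the power sum of the set over GF(2^m)."""
    return reduce(lambda a, b: a ^ b, elements, 0)


def reconcile(own, received_sketch):
    """Return the symmetric difference, assuming it contains at most one element."""
    diff = sketch(own) ^ received_sketch
    return set() if diff == 0 else {diff}
```

Bob, holding `own`, learns the symmetric difference from a single short message and can then compute Alice's set as `own ^ reconcile(own, s)`.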

Conversely, a good (single-message) set reconciliation scheme makes a good secure sketch: simply
make the message the sketch. The entropy loss will be at most the length of the message, which is short
in a communication-efficient scheme. Thus, the set reconciliation scheme CPISync of [MTZ03] makes a
good secure sketch. In fact, it is quite similar to the secure sketch of Section 6.2, except instead
of the top $t$ coefficients of the characteristic polynomial it uses the values of the polynomial at
$t$ points.

PinSketch of Section 6.3, when used for set reconciliation, achieves the same parameters as CPISync
of [MTZ03], except decoding is faster, because instead of spending time to solve a system of linear
equations, it spends $t^2$ time for Euclid's algorithm. Thus, it can be substituted wherever CPISync
is used, such as PDA synchronization [STA03] and PGP key server updates [Min04]. Furthermore,
optimizations that improve computational complexity of CPISync through the use of interaction [MT02]
can also be applied to PinSketch.

Of course, secure sketches for other metrics are similarly related to information reconciliation for
those metrics. In particular, ideas for edit distance very similar to ours were independently
considered in the context of information reconciliation by [CT04].

A Proof of Lemma 2.2



Recall that Lemma 2.2 considered random variables $A, B, C$ and consisted of two parts, which we prove
one after the other.

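Part (a) stated that for any $\delta > 0$, the conditional entropy $\mathbf{H}_\infty(A \mid B = b)$ is
at least $\tilde{\mathbf{H}}_\infty(A \mid B) - \log(1/\delta)$ with probability at least $1 - \delta$
(the probability here is taken over the choice of $b$). Let
$p = 2^{-\tilde{\mathbf{H}}_\infty(A \mid B)} = \mathbb{E}_b\left[2^{-\mathbf{H}_\infty(A \mid B = b)}\right]$.
By the Markov inequality, $2^{-\mathbf{H}_\infty(A \mid B = b)} \le p/\delta$ with probability at least
$1 - \delta$. Taking logarithms, part (a) follows.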

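Part (b) stated that if $B$ has at most $2^\lambda$ possible values, then
$\tilde{\mathbf{H}}_\infty(A \mid (B, C)) \ge \tilde{\mathbf{H}}_\infty((A, B) \mid C) - \lambda \ge \tilde{\mathbf{H}}_\infty(A \mid C) - \lambda$.
In particular, $\tilde{\mathbf{H}}_\infty(A \mid B) \ge \mathbf{H}_\infty((A, B)) - \lambda \ge \mathbf{H}_\infty(A) - \lambda$.
Clearly, it suffices to prove the first assertion (the second follows from taking $C$ to be constant).
Moreover, the second inequality of the first assertion follows from the fact that
$\Pr[A = a \wedge B = b \mid C = c] \le \Pr[A = a \mid C = c]$, for any $c$. Thus, we prove only that
$\tilde{\mathbf{H}}_\infty(A \mid (B, C)) \ge \tilde{\mathbf{H}}_\infty((A, B) \mid C) - \lambda$:

$$
\begin{aligned}
\tilde{\mathbf{H}}_\infty(A \mid (B, C))
&= -\log \mathbb{E}_{(b,c) \leftarrow (B,C)}\left[\max_a \Pr[A = a \mid B = b \wedge C = c]\right] \\
&= -\log \sum_{(b,c)} \max_a \Pr[A = a \mid B = b \wedge C = c] \Pr[B = b \wedge C = c] \\
&= -\log \sum_{(b,c)} \max_a \Pr[A = a \wedge B = b \mid C = c] \Pr[C = c] \\
&= -\log \sum_{b} \mathbb{E}_{c \leftarrow C}\left[\max_a \Pr[A = a \wedge B = b \mid C = c]\right] \\
&\ge -\log \sum_{b} \mathbb{E}_{c \leftarrow C}\left[\max_{a,b'} \Pr[A = a \wedge B = b' \mid C = c]\right] \\
&= -\log \sum_{b} 2^{-\tilde{\mathbf{H}}_\infty((A,B) \mid C)}
\ge -\log 2^\lambda 2^{-\tilde{\mathbf{H}}_\infty((A,B) \mid C)}
= \tilde{\mathbf{H}}_\infty((A, B) \mid C) - \lambda.
\end{aligned}
$$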



The first inequality in the above derivation holds since taking the maximum over all pairs (a, b') (instead of 
over pairs (a, b) where b is fixed) increases the terms of the sum and hence decreases the negative log of the 
sum. 
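
The special case of part (b) with constant $C$, namely that conditioning on a variable with at most $2^\lambda$ values costs at most $\lambda$ bits relative to the joint min-entropy, can be checked numerically on small distributions. The sketch below uses the standard formula $\tilde{\mathbf{H}}_\infty(A \mid B) = -\log \sum_b \max_a \Pr[A = a \wedge B = b]$; the function names and the dictionary representation of the joint distribution are assumptions of this example.

```python
import math
from collections import defaultdict


def min_entropy(dist):
    """H_inf of a distribution given as {outcome: probability}."""
    return -math.log2(max(dist.values()))


def avg_min_entropy(joint):
    """Average min-entropy of A given B, for joint = {(a, b): Pr[A=a and B=b]}."""
    best = defaultdict(float)
    for (a, b), pr in joint.items():
        best[b] = max(best[b], pr)  # max over a of Pr[A=a and B=b]
    return -math.log2(sum(best.values()))
```

For $A$ uniform on four values and $B$ the parity of $A$ (so $\lambda = 1$), the joint min-entropy is 2 bits and the average min-entropy of $A$ given $B$ is exactly 1 bit, so the bound holds with equality.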



B On Smooth Variants of Average Min-Entropy and the Relationship to
Smooth Rényi Entropy

Min-entropy is a rather fragile measure: a single high-probability element can ruin the min-entropy of
an otherwise good distribution. This is often circumvented within proofs by considering a distribution
which is close to the distribution of interest, but which has higher entropy. Renner and Wolf [RW04]
systematized this approach with the notion of $\epsilon$-smooth min-entropy (they use the term "Rényi
entropy of order $\infty$" instead of "min-entropy"), which considers all distributions that are
$\epsilon$-close:

$$\mathbf{H}^\epsilon_\infty(A) = \max_{B :\, \mathbf{SD}(A, B) \le \epsilon} \mathbf{H}_\infty(B).$$

Smooth min-entropy very closely relates to the amount of extractable nearly uniform randomness: if one
can map $A$ to a distribution that is $\epsilon$-close to $U_m$, then
$\mathbf{H}^\epsilon_\infty(A) \ge m$; conversely, from any $A$ such that
$\mathbf{H}^\epsilon_\infty(A) \ge m$, and for any $\epsilon_2$, one can extract
$m - 2\log(1/\epsilon_2)$ bits that are $(\epsilon + \epsilon_2)$-close to uniform (see [RW04]
for a more precise statement; the proof of the first statement follows by considering the inverse map,
and the proof of the second from the leftover hash lemma, which is discussed in more detail in
Lemma 2.4). For some distributions, considering the smooth min-entropy will improve the number and
quality of extractable random bits.

A smooth version of average min-entropy can also be considered, defined as

$$\tilde{\mathbf{H}}^\epsilon_\infty(A \mid B) = \max_{(C,D) :\, \mathbf{SD}((A,B),(C,D)) \le \epsilon} \tilde{\mathbf{H}}_\infty(C \mid D).$$

It similarly relates very closely to the number of extractable bits that look nearly uniform to the
adversary who knows the value of $B$, and is therefore perhaps a better measure for the quality of a
secure sketch that is used to obtain a fuzzy extractor. All our results can be cast in terms of smooth
entropies throughout, with appropriate modifications (if input entropy is $\epsilon$-smooth, then
output entropy will also be $\epsilon$-smooth, and extracted random strings will be $\epsilon$ further
away from uniform). We avoid doing so for simplicity of exposition. However, for some input
distributions, particularly ones with few elements of relatively high probability, this will improve
the result by giving more secure sketches or longer-output fuzzy extractors.
Finally, a word is in order on the relation of average min-entropy to conditional min-entropy,
introduced by Renner and Wolf in [RW05], and defined as
$\mathbf{H}_\infty(A \mid B) = -\log \max_{a,b} \Pr(A = a \mid B = b) = \min_b \mathbf{H}_\infty(A \mid B = b)$
(an $\epsilon$-smooth version is defined analogously by considering all distributions $(C, D)$
that are within $\epsilon$ of $(A, B)$ and taking the maximum among them). This definition is too
strict: it takes the worst-case $b$, while for randomness extraction (and many other settings, such as
predictability by an adversary), average-case $b$ suffices. Average min-entropy leads to more
extractable bits. Nevertheless, after smoothing the two notions are equivalent up to an additive
$\log(1/\epsilon_2)$ term:
$\tilde{\mathbf{H}}^\epsilon_\infty(A \mid B) \ge \mathbf{H}^\epsilon_\infty(A \mid B)$ and
$\mathbf{H}^{\epsilon+\epsilon_2}_\infty(A \mid B) \ge \tilde{\mathbf{H}}^\epsilon_\infty(A \mid B) - \log(1/\epsilon_2)$
(for the case of $\epsilon = 0$, this follows by constructing a new distribution that eliminates all
$b$ for which $\mathbf{H}_\infty(A \mid B = b) < \tilde{\mathbf{H}}_\infty(A \mid B) - \log(1/\epsilon_2)$,
which will be within $\epsilon_2$ of $(A, B)$ by Markov's inequality; for $\epsilon > 0$, an analogous
proof works). Note that by Lemma 2.2(b), this implies a simple chain rule for
$\mathbf{H}^\epsilon_\infty$ (a more general one is given in [RW05, Section 2.4]):
$\mathbf{H}^{\epsilon+\epsilon_2}_\infty(A \mid B) \ge \mathbf{H}^\epsilon_\infty((A, B)) - \mathbf{H}_0(B) - \log(1/\epsilon_2)$,
where $\mathbf{H}_0(B)$ is the logarithm of the number of possible values of $B$.

C Lower Bounds from Coding 

Recall that an $(\mathcal{M}, K, t)$-code is a subset of $K$ points of the metric space $\mathcal{M}$
which can correct $t$ errors (this is slightly different from the usual notation of coding theory
literature).

Let $K(\mathcal{M}, t)$ be the largest $K$ for which there exists an $(\mathcal{M}, K, t)$-code. Given
any set $S$ of $2^m$ points in $\mathcal{M}$, we let $K(\mathcal{M}, t, S)$ be the largest $K$ such
that there exists an $(\mathcal{M}, K, t)$-code all of whose $K$ points belong to $S$. Finally, we let
$L(\mathcal{M}, t, m) = \log\left(\min_{|S| = 2^m} K(\mathcal{M}, t, S)\right)$. Of course, when
$m = \log|\mathcal{M}|$, we get $L(\mathcal{M}, t, m) = \log K(\mathcal{M}, t)$. The exact
determination of quantities $K(\mathcal{M}, t)$ and $K(\mathcal{M}, t, S)$ is a central problem of
coding theory and is typically very hard. To the best of our knowledge, the quantity
$L(\mathcal{M}, t, m)$ was not explicitly studied in any of the three metrics that we study, and its
exact determination seems hard as well.

We give two simple lower bounds on the entropy loss (one for secure sketches, the other for fuzzy
extractors) which show that our constructions for the Hamming and set difference metrics output as much
entropy $m'$ as possible when the original input distribution is uniform. In particular, because the
constructions have the same entropy loss regardless of $m$, they are optimal in terms of the entropy
loss $m - m'$. We conjecture that the constructions also have the highest possible value $m'$ for all
values of $m$, but we do not have a good enough understanding of $L(\mathcal{M}, t, m)$ (where
$\mathcal{M}$ is the Hamming metric) to substantiate the conjecture.

Lemma C.1. The existence of an $(\mathcal{M}, m, m', t)$ secure sketch implies that
$m' \le L(\mathcal{M}, t, m)$. In particular, when $m = \log|\mathcal{M}|$ (i.e., when the password is
truly uniform), $m' \le \log K(\mathcal{M}, t)$.

Proof. Assume $\mathsf{SS}$ is such a secure sketch. Let $S$ be any set of size $2^m$ in
$\mathcal{M}$, and let $W$ be uniform over $S$. Then we must have
$\tilde{\mathbf{H}}_\infty(W \mid \mathsf{SS}(W)) \ge m'$. In particular, there must be some value $v$
such that $\mathbf{H}_\infty(W \mid \mathsf{SS}(W) = v) \ge m'$. But this means that conditioned on
$\mathsf{SS}(W) = v$, there are at least $2^{m'}$ points $w$ in $S$ (call this set $T$) which could
produce $\mathsf{SS}(W) = v$. We claim that these $2^{m'}$ values of $w$ form a code of
error-correcting distance $t$. Indeed, otherwise there would be a point $w' \in \mathcal{M}$ such that
$\mathsf{dis}(w_0, w') \le t$ and $\mathsf{dis}(w_1, w') \le t$ for some distinct $w_0, w_1 \in T$. But
then we must have that $\mathsf{Rec}(w', v)$ is equal to both $w_0$ and $w_1$, which is impossible.
Thus, the set $T$ above must form an $(\mathcal{M}, 2^{m'}, t)$-code inside $S$, which means that
$m' \le \log K(\mathcal{M}, t, S)$. Since $S$ was arbitrary, the bound follows. □

Lemma C.2. The existence of $(\mathcal{M}, m, \ell, t, \epsilon)$-fuzzy extractors implies that
$\ell \le L(\mathcal{M}, t, m) - \log(1 - \epsilon)$. In particular, when $m = \log|\mathcal{M}|$
(i.e., when the password is truly uniform), $\ell \le \log K(\mathcal{M}, t) - \log(1 - \epsilon)$.

Proof. Assume $(\mathsf{Gen}, \mathsf{Rep})$ is such a fuzzy extractor. Let $S$ be any set of size
$2^m$ in $\mathcal{M}$, let $W$ be uniform over $S$ and let $(R, P) \leftarrow \mathsf{Gen}(W)$. Then
we must have $\mathbf{SD}((R, P), (U_\ell, P)) \le \epsilon$. In particular, there must be some value
$p$ of $P$ such that $R$ is $\epsilon$-close to $U_\ell$ conditioned on $P = p$. In particular, this
means that conditioned on $P = p$, there are at least $(1 - \epsilon)2^\ell$ points
$r \in \{0,1\}^\ell$ (call this set $T$) which could be extracted with $P = p$. Now, map every
$r \in T$ to some arbitrary $w \in S$ which could have produced $r$ with nonzero probability given
$P = p$, and call this map $C$. $C$ must define a code with error-correcting distance $t$ by the same
reasoning as in Lemma C.1. □

Observe that, as long as $\epsilon < 1/2$, we have $0 \le -\log(1 - \epsilon) < 1$, so the lower bounds
on secure sketches and fuzzy extractors differ by less than a bit.



D Analysis of the Original Juels-Sudan Construction 

In this section we present a new analysis for the Juels-Sudan secure sketch for set difference. We will
assume that $n$ is a prime power and work over the field $\mathcal{F} = GF(n)$. On input set $w$, the
original Juels-Sudan sketch is a list of $r$ pairs of points $(x_i, y_i)$ in $\mathcal{F}$, for some
parameter $r$, $s < r \le n$. It is computed as follows:

Construction 10 (Original Juels-Sudan Secure Sketch [JS06]).

Input: a set $w \subseteq \mathcal{F}$ of size $s$ and parameters $r \in \{s + 1, \ldots, n\}$,
$t \in \{1, \ldots, s\}$

1. Choose $p()$ at random from the set of polynomials of degree at most $k = s - t - 1$ over
$\mathcal{F}$. Write $w = \{x_1, \ldots, x_s\}$, and let $y_i = p(x_i)$ for $i = 1, \ldots, s$.

2. Choose $r - s$ distinct points $x_{s+1}, \ldots, x_r$ at random from $\mathcal{F} - w$.

3. For $i = s + 1, \ldots, r$, choose $y_i \in \mathcal{F}$ at random such that $y_i \ne p(x_i)$.

4. Output $\mathsf{SS}(w) = \{(x_1, y_1), \ldots, (x_r, y_r)\}$ (in lexicographic order of $x_i$).
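
The four steps of the construction can be sketched as follows. This illustration assumes $n$ is prime, so that $GF(n)$ is the integers modulo $n$; the function name and the optional `coeffs` argument, which fixes the polynomial for reproducible testing, are not part of the construction.

```python
import secrets


def js_sketch(w, n, r, t, coeffs=None):
    """Original Juels-Sudan sketch of the set w over GF(n), n prime (assumption)."""
    s = len(w)
    assert s < r <= n and 1 <= t <= s
    k = s - t - 1
    if coeffs is None:
        # Step 1: random polynomial of degree at most k (k + 1 coefficients).
        coeffs = [secrets.randbelow(n) for _ in range(k + 1)]

    def p(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation modulo n
            acc = (acc * x + c) % n
        return acc

    points = {x: p(x) for x in w}
    while len(points) < r:
        x = secrets.randbelow(n)    # Step 2: fresh chaff position outside w
        if x in points:
            continue
        # Step 3: uniform y in F - {p(x)}
        points[x] = (p(x) + 1 + secrets.randbelow(n - 1)) % n
    return sorted(points.items())   # Step 4: output in order of x
```

The rejection sampling in step 2 is efficient when $r \ll n$; for $r$ close to $n$ one would instead sample from an explicit list of the unused field elements.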

The parameter $t$ measures the error-tolerance of the scheme: given $\mathsf{SS}(w)$ and a set $w'$
such that $|w \triangle w'| \le t$, one can recover $w$ by considering the pairs $(x_i, y_i)$ for
$x_i \in w'$ and running Reed-Solomon decoding to recover the low-degree polynomial $p(\cdot)$. When
the parameter $r$ is very small, the scheme corrects approximately twice as many errors with good
probability (in the "input-dependent" sense from Section 8). When $r$ is low, however, we show here
that the bound on the entropy loss becomes very weak.

The parameter $r$ dictates the amount of storage necessary, on the one hand, and also the security of
the scheme (that is, for $r = s$ the scheme leaks all information and for larger and larger $r$ there
is less information about $w$). Juels and Sudan actually propose two analyses for the scheme. First,
they analyze the case where the secret $w$ is distributed uniformly over all subsets of size $s$.
Second, they provide an analysis of a nonuniform password distribution, but only for the case $r = n$
(that is, their analysis applies only in the small universe setting, where $\Omega(n)$ storage is
acceptable). Here we give a simpler analysis which handles nonuniformity and any $r \le n$. We get the
same results for a broader set of parameters.

Lemma D.1. The entropy loss of the Juels-Sudan scheme is at most
$t \log n + \log\binom{n}{r} - \log\binom{n-s}{r-s} + 2$.

Proof. This is a simple application of Lemma 2.2(b). $\mathbf{H}_\infty((W, \mathsf{SS}(W)))$ can be
computed as follows. Choosing the polynomial $p$ (which can be uniquely recovered from $w$ and
$\mathsf{SS}(w)$) requires $s - t$ random choices from $\mathcal{F}$. The choice of the remaining
$x_i$'s requires $\log\binom{n-s}{r-s}$ bits, and choosing the $y_i$'s requires $r - s$ random choices
from $\mathcal{F} - \{p(x_i)\}$. Thus,
$\mathbf{H}_\infty((W, \mathsf{SS}(W))) = \mathbf{H}_\infty(W) + (s - t)\log n + \log\binom{n-s}{r-s} + (r - s)\log(n - 1)$.
The output can be described in $\log\left(\binom{n}{r} n^r\right)$ bits. The result follows by
Lemma 2.2(b) after observing that $(r - s)\log\frac{n}{n-1} \le n\log\frac{n}{n-1} \le 2$. □

In the large universe setting, we will have $r \ll n$ (since we wish to have storage polynomial in
$s$). In that setting, the bound on the entropy loss of the Juels-Sudan scheme is in fact very large.
We can rewrite the entropy loss as $t \log n - \log\binom{r}{s} + \log\binom{n}{s} + 2$, using the
identity $\binom{n}{r}\binom{r}{s} = \binom{n}{s}\binom{n-s}{r-s}$. Now the entropy of $W$ is at most
$\log\binom{n}{s}$, and so our lower bound on the remaining entropy is
$\log\binom{r}{s} - t \log n - 2$. To make this quantity large requires making $r$ very large.
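
The two expressions above are easy to evaluate numerically. The sketch below computes the bound of Lemma D.1 and the resulting lower bound on the remaining entropy for a uniformly chosen $W$; the function names and the sample parameters in the note that follows are illustrative assumptions.

```python
import math


def js_entropy_loss_bound(n, s, r, t):
    """Upper bound on the entropy loss from Lemma D.1, in bits."""
    return (t * math.log2(n) + math.log2(math.comb(n, r))
            - math.log2(math.comb(n - s, r - s)) + 2)


def js_remaining_entropy_bound(n, s, r, t):
    """Lower bound log C(r, s) - t log n - 2 on the remaining entropy, in bits."""
    return math.log2(math.comb(r, s)) - t * math.log2(n) - 2
```

With $n = 1024$, $s = 10$, and $t = 3$, the lower bound on the remaining entropy is negative (hence vacuous) for $r = 20$ and becomes positive only for much larger $r$, such as $r = 100$.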

E BCH Syndrome Decoding in Sublinear Time 

We show that the standard decoding algorithm for BCH codes can be modified to run in time polynomial
in the length of the syndrome. This works for BCH codes over any field $GF(q)$, which include Hamming
codes in the binary case and Reed-Solomon for the case $n = q - 1$. BCH codes are handled in detail in
many textbooks (e.g., [vL92]); our presentation here is quite terse. For simplicity, we discuss only
primitive, narrow-sense BCH codes here; the discussion extends easily to the general case.

The algorithm discussed here has been revised due to an error pointed out by Ari Trachtenberg. Its
implementation is available [HJR06].

We'll use a slightly nonstandard formulation of BCH codes. Let $n = q^m - 1$ (in the binary case of
interest in Section 6.3, $q = 2$). We will work in two finite fields: $GF(q)$ and a larger extension
field $\mathcal{F} = GF(q^m)$. BCH codewords, formally defined below, are then vectors in $GF(q)^n$.
In most common presentations, one indexes the $n$ positions of these vectors by discrete logarithms of
the elements of $\mathcal{F}^*$: position $i$, for $1 \le i \le n$, corresponds to $\alpha^i$, where
$\alpha$ generates the multiplicative group $\mathcal{F}^*$. However, there is no inherent reason to
do so: they can be indexed by elements of $\mathcal{F}$ directly rather than by their discrete
logarithms. Thus, we say that a word has value $p_x$ at position $x$, where $x \in \mathcal{F}^*$. If
one ever needs to write down the entire $n$-character word in an ordered fashion, one can arbitrarily
choose a convenient ordering of the elements of $\mathcal{F}$ (e.g., by using some standard binary
representation of field elements); for our purposes this is not necessary, as we do not store entire
$n$-bit words explicitly, but rather represent them by their supports:
$\mathsf{supp}(p) = \{(x, p_x) \mid p_x \ne 0\}$. Note that for the binary case of interest in
Section 6.3, we can define $\mathsf{supp}(p) = \{x \mid p_x \ne 0\}$, because $p_x$ can take only two
values: 0 or 1.

Our choice of representation will be crucial for efficient decoding: in the more common representation,
the last step of the decoding algorithm requires one to find the position $i$ of the error from the
field element $\alpha^i$. However, no efficient algorithms for computing the discrete logarithm are
known if $q^m$ is large (indeed, a lot of cryptography is based on the assumption that such an
efficient algorithm does not exist). In our representation, the field element $\alpha^i$ will in fact
be the position of the error.

Definition 8. The (narrow-sense, primitive) BCH code of designed distance $\delta$ over $GF(q)$ (of
length $n \ge \delta$) is given by the set of vectors of the form $(c_x)_{x \in \mathcal{F}^*}$, such
that each $c_x$ is in the smaller field $GF(q)$, and the vector satisfies the constraints
$\sum_{x \in \mathcal{F}^*} c_x x^i = 0$, for $i = 1, \ldots, \delta - 1$, with arithmetic done in the
larger field $\mathcal{F}$.

To explain this definition, let us fix a generator $\alpha$ of the multiplicative group of the large
field $\mathcal{F}^*$. For any vector of coefficients $(c_x)_{x \in \mathcal{F}^*}$, we can define a
polynomial

$$c(z) = \sum_{x \in \mathcal{F}^*} c_x z^{\mathrm{dlog}(x)},$$

where $\mathrm{dlog}(x)$ is the discrete logarithm of $x$ with respect to $\alpha$. The conditions of
the definition are then equivalent to the requirement (more commonly seen in presentations of BCH
codes) that $c(\alpha^i) = 0$ for $i = 1, \ldots, \delta - 1$, because
$(\alpha^i)^{\mathrm{dlog}(x)} = (\alpha^{\mathrm{dlog}(x)})^i = x^i$.

We can simplify this somewhat. Because the coefficients $c_x$ are in $GF(q)$, they satisfy
$c_x^q = c_x$. Using the identity $(x + y)^q = x^q + y^q$, which holds even in the large field
$\mathcal{F}$, we have $c(\alpha^i)^q = \sum_{x \ne 0} c_x^q x^{iq} = c(\alpha^{iq})$. Thus, roughly a
$1/q$ fraction of the conditions in the definition are redundant: we need only to check that they hold
for $i \in \{1, \ldots, \delta - 1\}$ such that $q \nmid i$.

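The syndrome of a word (not necessarily a codeword) $(p_x)_{x \in \mathcal{F}^*} \in GF(q)^n$ with
respect to the BCH code above is the vector

$$\mathsf{syn}(p) = \left(p(\alpha^1), \ldots, p(\alpha^{\delta-1})\right), \quad\text{where}\quad p(\alpha^i) = \sum_{x \in \mathcal{F}^*} p_x x^i.$$

As mentioned above, we do not in fact have to include the values $p(\alpha^i)$ such that $q \mid i$.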

Computing with Low-Weight Words. A low-weight word $p \in GF(q)^n$ can be represented either as
a long string or, more compactly, as a list of positions where it is nonzero and its values at those points. We
call this representation the support list of $p$ and denote it supp($p$) = $\{(x, p_x)\}_{x : p_x \neq 0}$.
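The support-list representation and the syndrome definition can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the text: the toy field $GF(2^4)$ built with the polynomial $x^4 + x + 1$, the designed distance 5, and the helper names `gf_mul` and `syndrome`. Field elements are encoded as 4-bit integers, and in the binary case the support list is just a set of positions.

```python
M_BITS = 4            # assumption: toy field F = GF(2^4), so q = 2, m = 4, n = 15
MODULUS = 0b10011     # assumption: F is built with x^4 + x + 1 (irreducible over GF(2))
DELTA = 5             # assumption: designed distance of the toy code

def gf_mul(a, b):
    """Multiply two elements of GF(2^4), each encoded as a 4-bit integer."""
    result = 0
    while b:
        if b & 1:
            result ^= a            # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1
        if a >> M_BITS:            # degree reached 4: reduce modulo MODULUS
            a ^= MODULUS
    return result

def syndrome(support):
    """Return [p(alpha^1), ..., p(alpha^(DELTA-1))] for a binary word.

    support is the set of nonzero field elements x with p_x = 1.
    Uses (DELTA - 1) multiplications per element of the support.
    """
    syn = [0] * (DELTA - 1)
    for x in support:
        power = x                          # x^1
        for i in range(DELTA - 1):
            syn[i] ^= power                # add x^(i+1) to p(alpha^(i+1))
            power = gf_mul(power, x)
    return syn

support = {3, 7, 12}
syn = syndrome(support)
# The entries with q | i are redundant: p(alpha^2) = p(alpha)^2 and p(alpha^4) = p(alpha^2)^2.
assert syn[1] == gf_mul(syn[0], syn[0])
assert syn[3] == gf_mul(syn[1], syn[1])
```

The work is proportional to the size of the support list rather than to $n$, which is the point of the compact representation.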

Lemma E.1. For a $q$-ary BCH code $C$ of designed distance $\delta$, one can compute:

1. syn($p$) from supp($p$) in time polynomial in $\delta$, $\log n$, and |supp($p$)|, and

2. supp($p$) from syn($p$) (when $p$ has weight at most $(\delta - 1)/2$), in time polynomial in $\delta$ and $\log n$.

Proof. Recall that syn($p$) = $(p(\alpha), \ldots, p(\alpha^{\delta-1}))$ where $p(\alpha^i) = \sum_{x \neq 0} p_x x^i$. Part (1) is easy, since to
compute the syndrome we need only to compute the powers of $x$. This requires about $\delta \cdot$ weight($p$) multiplications
in $\mathcal{F}$. For Part (2), we adapt Berlekamp's BCH decoding algorithm, based on its presentation in
[vL92]. Let $M = \{x \in \mathcal{F}^* \mid p_x \neq 0\}$, and define

$$\sigma(z) \stackrel{\text{def}}{=} \prod_{x \in M} (1 - xz) \quad \text{and} \quad \omega(z) \stackrel{\text{def}}{=} \sigma(z) \sum_{x \in M} \frac{p_x x z}{1 - xz}.$$

Since $(1 - xz)$ divides $\sigma(z)$ for $x \in M$, we see that $\omega(z)$ is in fact a polynomial of degree at most $|M|$ =
weight($p$) $\leq (\delta - 1)/2$. The polynomials $\sigma(z)$ and $\omega(z)$ are known as the error locator polynomial and
evaluator polynomial, respectively; observe that $\gcd(\sigma(z), \omega(z)) = 1$.

We will in fact work with our polynomials modulo $z^\delta$. In this arithmetic the inverse of $(1 - xz)$ is
$\sum_{\ell=1}^{\delta} (xz)^{\ell-1}$; that is,

$$(1 - xz) \sum_{\ell=1}^{\delta} (xz)^{\ell-1} \equiv 1 \bmod z^\delta.$$

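We are given $p(\alpha^\ell)$ for $\ell = 1, \ldots, \delta - 1$. Let $S(z) = \sum_{\ell=1}^{\delta-1} p(\alpha^\ell) z^\ell$. Note that
$S(z) \equiv \sum_{x \in M} p_x \frac{xz}{1 - xz} \bmod z^\delta$. This implies that

$$S(z)\sigma(z) \equiv \omega(z) \bmod z^\delta.$$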
The polynomials $\sigma(z)$ and $\omega(z)$ satisfy the following four conditions: they are of degree at most $(\delta - 1)/2$
each, they are relatively prime, the constant coefficient of $\sigma$ is 1, and they satisfy this congruence. In fact,
let $\omega'(z), \sigma'(z)$ be any nonzero solution to this congruence, where degrees of $\omega'(z)$ and $\sigma'(z)$ are at most
$(\delta - 1)/2$. Then $\omega'(z)/\sigma'(z) = \omega(z)/\sigma(z)$. (To see why this is so, multiply the initial congruence by $\sigma'(z)$
to get $\omega(z)\sigma'(z) \equiv \sigma(z)\omega'(z) \bmod z^\delta$. Since both sides of the congruence have degree at most $\delta - 1$,
they are in fact equal as polynomials.) Thus, there is at most one solution $\sigma(z), \omega(z)$ satisfying all four
conditions, which can be obtained from any $\sigma'(z), \omega'(z)$ by reducing the resulting fraction $\omega'(z)/\sigma'(z)$ to
obtain the solution of minimal degree with the constant term of $\sigma$ equal to 1.
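As a small sanity check of the congruence, consider a toy instance that is an assumption of this illustration and not taken from the text: $q = 13$, $m = 1$ (so $\mathcal{F} = GF(13)$ and $n = 12$), $\delta = 5$, and a word $p$ with support list $\{(2, 1), (5, 3)\}$. With all arithmetic modulo 13,

$$\begin{aligned}
\mathrm{syn}(p) &= (2 + 3 \cdot 5,\; 2^2 + 3 \cdot 5^2,\; 2^3 + 3 \cdot 5^3,\; 2^4 + 3 \cdot 5^4) = (4, 1, 6, 6), \\
S(z) &= 4z + z^2 + 6z^3 + 6z^4, \\
\sigma(z) &= (1 - 2z)(1 - 5z) = 1 + 6z + 10z^2, \\
\omega(z) &= 1 \cdot 2z(1 - 5z) + 3 \cdot 5z(1 - 2z) = 4z + 12z^2, \\
S(z)\sigma(z) &= 4z + 12z^2 + 0 \cdot z^3 + 0 \cdot z^4 + 5z^5 + 8z^6 \equiv \omega(z) \bmod z^5.
\end{aligned}$$

Both $\sigma$ and $\omega$ have degree $2 = (\delta - 1)/2$, and the constant coefficient of $\sigma$ is 1, as the four conditions require.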

Finally, the roots of $\sigma(z)$ are the points $x^{-1}$ for $x \in M$, and the exact value of $p_x$ can be recovered from
$\omega(x^{-1}) = p_x \prod_{y \in M, y \neq x} (1 - y x^{-1})$ (this is needed only for $q > 2$, because for $q = 2$, $p_x = 1$). Note that
it is possible that a solution to the congruence will be found even if the input syndrome is not a syndrome
of any $p$ with weight($p$) $\leq (\delta - 1)/2$ (it is also possible that a solution to the congruence will not be found
at all, or that the resulting $\sigma(z)$ will not split into distinct nonzero roots). Such a solution will not give
the correct $p$. Thus, if there is no guarantee that weight($p$) is actually at most $(\delta - 1)/2$, it is necessary to
recompute syn($p$) after finding the solution, in order to verify that $p$ is indeed correct.

Representing coefficients of $\sigma'(z)$ and $\omega'(z)$ as unknowns, we see that solving the congruence requires
only solving a system of $\delta$ linear equations (one for each degree of $z$, from 0 to $\delta - 1$) involving $\delta + 1$ variables
over $\mathcal{F}$, which can be done in $O(\delta^3)$ operations in $\mathcal{F}$ using, e.g., Gaussian elimination. The reduction of the
fraction $\omega'(z)/\sigma'(z)$ requires simply running Euclid's algorithm for finding the g.c.d. of two polynomials of
degree less than $\delta$, which takes $O(\delta^2)$ operations in $\mathcal{F}$. Suppose the resulting $\sigma$ has degree $e$. Then one can
find the roots of $\sigma$ as follows. First test that $\sigma$ indeed has $e$ distinct roots by testing that $\sigma(z) \mid z^{q^m} - z$ (this
is a necessary and sufficient condition, because every element of $\mathcal{F}$ is a root of $z^{q^m} - z$ exactly once). This
can be done by computing $(z^{q^m} \bmod \sigma(z))$ and testing if it equals $z \bmod \sigma$; it takes $m$ exponentiations of a
polynomial to the power $q$, i.e., $O((m \log q)e^2)$ operations in $\mathcal{F}$. Then apply an equal-degree-factorization
algorithm (e.g., as described in [Sho05]), which also takes $O((m \log q)e^2)$ operations in $\mathcal{F}$. Finally, after
taking inverses of the roots of $\sigma$ and finding $p_x$ (which takes $O(e^2)$ operations in $\mathcal{F}$), recompute syn($p$) to
verify that it is equal to the input value.
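The distinct-roots test can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure of [Sho05]: it assumes $m = 1$ and a prime $q = 13$, so that $\mathcal{F}$ is the integers modulo 13, and the names `poly_mod`, `mul_mod`, and `has_all_distinct_roots` are hypothetical. Polynomials are lists of coefficients, lowest degree first. It needs Python 3.8 or later for `pow(x, -1, Q)`.

```python
Q = 13   # assumption: toy field F = GF(13), i.e. q = 13 and m = 1

def poly_mod(a, f):
    """Remainder of a modulo f; f must have a nonzero leading coefficient."""
    a = list(a)
    inv_lead = pow(f[-1], -1, Q)
    while len(a) >= len(f):
        coef = a[-1] * inv_lead % Q
        shift = len(a) - len(f)
        for i, y in enumerate(f):
            a[i + shift] = (a[i + shift] - coef * y) % Q
        a.pop()                      # the leading term is now zero
    while a and a[-1] == 0:
        a.pop()                      # normalize: no high-order zeros
    return a

def mul_mod(a, b, f):
    """Product of a and b, reduced modulo f."""
    out = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % Q
    return poly_mod(out, f)

def has_all_distinct_roots(sigma):
    """True iff sigma divides z^Q - z, i.e. sigma splits into distinct roots in F."""
    z = poly_mod([0, 1], sigma)
    result, base, exponent = [1], z, Q
    while exponent:                  # square-and-multiply: z^Q mod sigma
        if exponent & 1:
            result = mul_mod(result, base, sigma)
        base = mul_mod(base, base, sigma)
        exponent >>= 1
    return result == z

print(has_all_distinct_roots([1, 6, 10]))   # (1 - 2z)(1 - 5z): two distinct roots
print(has_all_distinct_roots([1, 11, 1]))   # (z - 1)^2: repeated root
```

The loop performs about $\log_2 q$ squarings of polynomials of degree less than $e$, in line with the $O((m \log q)e^2)$ bound when schoolbook multiplication is used.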

Because $m \log q = \log(n + 1)$ and $e \leq (\delta - 1)/2$, the total running time is $O(\delta^3 + \delta^2 \log n)$ operations
in $\mathcal{F}$; each operation in $\mathcal{F}$ can be done in time $O(\log^2 n)$, or faster using advanced techniques.

One can improve this running time substantially. The error locator polynomial $\sigma(z)$ can be found in
$O(\log \delta)$ convolutions (multiplications) of polynomials over $\mathcal{F}$ of degree $(\delta - 1)/2$ each [Bla83, Section
11.7] by exploiting the special structure of the system of linear equations being solved. Each convolution can
be performed asymptotically in time $O(\delta \log \delta \log \log \delta)$ (see, e.g., [vzGG03]), and the total time required
to find $\sigma$ gets reduced to $O(\delta \log^2 \delta \log \log \delta)$ operations in $\mathcal{F}$. This replaces the $\delta^3$ term in the above running
time.

While this is asymptotically very good, Euclidean-algorithm-based decoding [SKHN75], which runs
in $O(\delta^2)$ operations in $\mathcal{F}$, will find $\sigma(z)$ faster for reasonable values of $\delta$ (certainly for $\delta < 1000$). The
algorithm finds $\sigma$ as follows:

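set $R_{old}(z) \leftarrow z^{\delta-1}$, $R_{cur}(z) \leftarrow S(z)/z$, $V_{old}(z) \leftarrow 0$, $V_{cur}(z) \leftarrow 1$.
while $\deg(R_{cur}(z)) \geq (\delta - 1)/2$:

    divide $R_{old}(z)$ by $R_{cur}(z)$ to get quotient $q(z)$ and remainder $R_{new}(z)$;

    set $V_{new}(z) \leftarrow V_{old}(z) - q(z)V_{cur}(z)$;

    set $R_{old}(z) \leftarrow R_{cur}(z)$, $R_{cur}(z) \leftarrow R_{new}(z)$, $V_{old}(z) \leftarrow V_{cur}(z)$, $V_{cur}(z) \leftarrow V_{new}(z)$.

set $c \leftarrow V_{cur}(0)$; set $\sigma(z) \leftarrow V_{cur}(z)/c$ and $\omega(z) \leftarrow z \cdot R_{cur}(z)/c$.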
In the above algorithm, if $c = 0$, then the correct $\sigma(z)$ does not exist, i.e., weight($p$) $> (\delta - 1)/2$. The
correctness of this algorithm can be seen by observing that the congruence $S(z)\sigma(z) \equiv \omega(z) \pmod{z^\delta}$ can
have $z$ factored out of it (because $S(z)$, $\omega(z)$ and $z^\delta$ are all divisible by $z$) and rewritten as
$(S(z)/z)\sigma(z) + u(z)z^{\delta-1} = \omega(z)/z$, for some $u(z)$. The obtained $\sigma$ is easily shown to be the correct one (if one exists at all)
by applying [Sho05, Theorem 18.7] (to use the notation of that theorem, set $n = z^{\delta-1}$, $y = S(z)/z$, $t^* =
r^* = (\delta - 1)/2$, $r' = \omega(z)/z$, $s' = u(z)$, $t' = \sigma(z)$).
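The Euclidean-algorithm-based decoder can be sketched end to end in Python. This is a minimal illustration under loudly stated assumptions, not the paper's implementation: it takes $m = 1$ and a prime $q = 13$ (so $\mathcal{F}$ is the integers modulo 13 and $n = 12$), uses $\delta = 5$, and finds the roots of $\sigma$ by brute force over $\mathcal{F}^*$ instead of equal-degree factorization, which is only reasonable for a toy field. All function names are hypothetical. Polynomials are coefficient lists, lowest degree first; `pow(x, -1, Q)` needs Python 3.8 or later.

```python
Q = 13                    # assumption: toy field F = GF(13), so q = 13, m = 1, n = 12
DELTA = 5                 # assumption: designed distance
T = (DELTA - 1) // 2      # number of correctable errors

def trim(a):
    """Copy of a without high-order zero coefficients."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def deg(a):
    return len(trim(a)) - 1                 # the zero polynomial gets degree -1

def sub(a, b):
    size = max(len(a), len(b))
    a, b = a + [0] * (size - len(a)), b + [0] * (size - len(b))
    return trim([(x - y) % Q for x, y in zip(a, b)])

def mul(a, b):
    out = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % Q
    return trim(out)

def divmod_poly(a, b):
    """Quotient and remainder of a divided by a nonzero polynomial b."""
    a, b = trim(a), trim(b)
    quo = [0] * max(len(a) - len(b) + 1, 0)
    inv_lead = pow(b[-1], -1, Q)
    while len(a) >= len(b):
        shift, coef = len(a) - len(b), a[-1] * inv_lead % Q
        quo[shift] = coef
        for i, y in enumerate(b):
            a[i + shift] = (a[i + shift] - coef * y) % Q
        a = trim(a)
    return trim(quo), a

def evaluate(a, x):
    result = 0
    for coef in reversed(a):                # Horner's rule
        result = (result * x + coef) % Q
    return result

def syndrome(support):
    """support maps positions x in F* to nonzero values p_x."""
    return [sum(v * pow(x, i, Q) for x, v in support.items()) % Q
            for i in range(1, DELTA)]

def decode(syn):
    """Return the support list with the given syndrome, or None on failure."""
    # syn lists the coefficients of S(z)/z; R_old starts as z^(DELTA-1).
    r_old, r_cur = [0] * (DELTA - 1) + [1], trim(syn)
    v_old, v_cur = [], [1]
    while deg(r_cur) >= T:
        quo, r_new = divmod_poly(r_old, r_cur)
        v_new = sub(v_old, mul(quo, v_cur))
        r_old, r_cur, v_old, v_cur = r_cur, r_new, v_cur, v_new
    c = v_cur[0] if v_cur else 0
    if c == 0:
        return None                         # no valid error locator exists
    c_inv = pow(c, -1, Q)
    sigma = [x * c_inv % Q for x in v_cur]
    omega = [0] + [x * c_inv % Q for x in r_cur]        # z * R_cur(z) / c
    # Brute-force root search (assumption: acceptable only because F is tiny).
    locations = [x for x in range(1, Q) if evaluate(sigma, pow(x, -1, Q)) == 0]
    if len(locations) != deg(sigma):
        return None                         # sigma does not split into distinct roots
    support = {}
    for x in locations:
        x_inv = pow(x, -1, Q)
        denom = 1
        for y in locations:
            if y != x:
                denom = denom * (1 - y * x_inv) % Q
        support[x] = evaluate(omega, x_inv) * pow(denom, -1, Q) % Q
    # Recompute the syndrome, since the input may not come from a light word.
    return support if syndrome(support) == list(syn) else None

print(decode(syndrome({2: 1, 5: 3})))       # recovers the support list {2: 1, 5: 3}
```

On this input the loop runs twice and ends with $V_{cur}(z) = 10 + 8z + 9z^2$ and $c = 10$, giving $\sigma(z) = 1 + 6z + 10z^2$ and $\omega(z) = 4z + 12z^2$.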

The root finding of $\sigma$ can also be sped up. Asymptotically, detecting if a polynomial over $\mathcal{F} =
GF(q^m) = GF(n + 1)$ of degree $e$ has $e$ distinct roots and finding these roots can be performed in
time $O(e^{1.815} (\log n)^{0.407})$ operations in $\mathcal{F}$ using the algorithm of Kaltofen and Shoup [KS95], or in time
$O(e^2 + (\log n) e \log e \log \log e)$ operations in $\mathcal{F}$ using the EDF algorithm of Cantor and Zassenhaus.
For reasonable values of $e$, the Cantor-Zassenhaus EDF algorithm with Karatsuba's multiplication algorithm
[KO63] for polynomials will be faster, giving root-finding running time of $O(e^2 + e^{\log_2 3} \log n)$ operations
in $\mathcal{F}$. Note that if the actual weight $e$ of $p$ is close to the maximum tolerated $(\delta - 1)/2$, then finding
the roots of $\sigma$ will actually take longer than finding $\sigma$. □

A Dual View of the Algorithm. Readers may be used to seeing a different, evaluation-based formulation
of BCH codes, in which codewords are generated as follows. Let $\mathcal{F}$ again be an extension of $GF(q)$,
and let $n$ be the length of the code (note that $|\mathcal{F}^*|$ is not necessarily equal to $n$ in this formulation). Fix
distinct $x_1, x_2, \ldots, x_n \in \mathcal{F}$. For every polynomial $c$ over the large field $\mathcal{F}$ of degree at most $n - \delta$, the
vector $(c(x_1), c(x_2), \ldots, c(x_n))$ is a codeword if and only if every coordinate of the vector happens to be in
the smaller field: $c(x_i) \in GF(q)$ for all $i$. In particular, when $\mathcal{F} = GF(q)$, then every polynomial leads to
a codeword, thus giving Reed-Solomon codes.

The syndrome in this formulation can be computed as follows: given a vector $y = (y_1, y_2, \ldots, y_n)$,
find the interpolating polynomial $P = p_{n-1}x^{n-1} + p_{n-2}x^{n-2} + \cdots + p_0$ over $\mathcal{F}$ of degree at most $n - 1$
such that $P(x_i) = y_i$ for all $i$. The syndrome is then the negative top $\delta - 1$ coefficients of $P$: syn($y$) =
$(-p_{n-1}, -p_{n-2}, \ldots, -p_{n-(\delta-1)})$. (It is easy to see that this is a syndrome: it is a linear function that is zero
exactly on the codewords.)

When $n = |\mathcal{F}| - 1$, we can index the $n$-component vectors by elements of $\mathcal{F}^*$, writing codewords as
$(c(x))_{x \in \mathcal{F}^*}$. In this case, the syndrome of $(y_x)_{x \in \mathcal{F}^*}$, defined as the negative top $\delta - 1$ coefficients of $P$ such
that $P(x) = y_x$ for all $x \in \mathcal{F}^*$, is equal to the syndrome defined following Definition 8 as $\sum_{x \in \mathcal{F}^*} y_x x^i$ for
$i = 1, 2, \ldots, \delta - 1$. Thus, when $n = |\mathcal{F}| - 1$, the codewords obtained via the evaluation-based definition
are identical to the codewords obtained via Definition 8, because codewords are simply elements with the zero
syndrome, and the syndrome maps agree.

This is an example of a remarkable duality between evaluations of polynomials and their coefficients:
the syndrome can be viewed either as the evaluation of a polynomial whose coefficients are given by the
vector, or as the coefficients of the polynomial whose evaluations are given by the vector.
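The agreement of the two syndrome maps can be checked numerically with a short Python sketch. It is only an illustration under assumptions that are not in the text: the toy prime field $GF(7)$ (so $n = |\mathcal{F}| - 1 = 6$), designed distance 4, an arbitrary test vector `y`, and hypothetical helper names. Interpolation is done with the plain Lagrange formula; `pow(x, -1, Q)` needs Python 3.8 or later.

```python
Q = 7          # assumption: toy prime field F = GF(7)
N = Q - 1      # code length n = |F| - 1
DELTA = 4      # assumption: designed distance; the syndrome has DELTA - 1 entries

def times_linear(poly, root):
    """Multiply poly (lowest degree first) by (z - root) modulo Q."""
    shifted = [0] + poly                            # z * poly
    scaled = [root * c % Q for c in poly] + [0]     # root * poly
    return [(s - t) % Q for s, t in zip(shifted, scaled)]

def interpolate(y):
    """Coefficients of the polynomial P of degree < N with P(x) = y[x] on F*."""
    coeffs = [0] * N
    for a in range(1, Q):
        basis, denom = [1], 1
        for b in range(1, Q):
            if b != a:
                basis = times_linear(basis, b)      # product of (z - b) over b != a
                denom = denom * (a - b) % Q
        scale = y[a] * pow(denom, -1, Q) % Q
        for k in range(N):
            coeffs[k] = (coeffs[k] + scale * basis[k]) % Q
    return coeffs

def syndrome_by_evaluation(y):
    """Entries sum_x y_x x^i for i = 1, ..., DELTA - 1."""
    return [sum(y[x] * pow(x, i, Q) for x in range(1, Q)) % Q
            for i in range(1, DELTA)]

def syndrome_by_interpolation(y):
    """Negated top DELTA - 1 coefficients of the interpolating polynomial."""
    coeffs = interpolate(y)
    return [(-coeffs[N - i]) % Q for i in range(1, DELTA)]

y = {1: 3, 2: 0, 3: 5, 4: 1, 5: 1, 6: 6}            # arbitrary vector indexed by F*
assert syndrome_by_evaluation(y) == syndrome_by_interpolation(y)
```

The check relies on $n = |\mathcal{F}| - 1$; for a proper subset of evaluation points the two maps need not coincide.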

The syndrome decoding algorithm above has a natural interpretation in the evaluation-based view. Our
presentation is an adaptation of Welch-Berlekamp decoding as presented in, e.g., [Sud01, Chapter 10].

Footnote (on the running time of the Cantor-Zassenhaus EDF algorithm): See [Sho05, Section 21.3], and substitute the
most efficient known polynomial arithmetic. For example, the procedures described in [vzGG03] take time
$O(e \log e \log \log e)$ instead of time $O(e^2)$ to perform modular arithmetic operations with degree-$e$ polynomials.

Footnote (on the agreement of the two syndrome definitions): This statement can be shown as follows: because both
maps are linear, it is sufficient to prove that they agree on a vector $(y_x)_{x \in \mathcal{F}^*}$ such that $y_a = 1$ for some
$a \in \mathcal{F}^*$ and $y_x = 0$ for $x \neq a$. For such a vector, $\sum_{x \in \mathcal{F}^*} y_x x^i = a^i$. On the other hand,
the interpolating polynomial $P(x)$ such that $P(x) = y_x$ is $-ax^{n-1} - a^2x^{n-2} - \cdots - a^{n-1}x - 1$ (indeed, $P(a) = -n = 1$;
furthermore, multiplying $P(x)$ by $x - a$ gives $-a(x^n - 1)$, which is zero on all of $\mathcal{F}^*$; hence $P(x)$ is zero for every $x \neq a$).
Suppose $n = |\mathcal{F}| - 1$ and $x_1, \ldots, x_n$ are the nonzero elements of the field. Let $y = (y_1, y_2, \ldots, y_n)$ be
a vector. We are given its syndrome syn($y$) = $(-p_{n-1}, -p_{n-2}, \ldots, -p_{n-(\delta-1)})$, where $p_{n-1}, \ldots, p_{n-(\delta-1)}$
are the top coefficients of the interpolating polynomial $P$. Knowing only syn($y$), we need to find at most
$(\delta - 1)/2$ locations $x_i$ such that correcting all the corresponding $y_i$ will result in a codeword. Suppose that
codeword is given by a degree-$(n - \delta)$ polynomial $c$. Note that $c$ agrees with $P$ on all but the error locations.
Let $\rho(z)$ be the polynomial of degree at most $(\delta - 1)/2$ whose roots are exactly the error locations. (Note
that $\sigma(z)$ from the decoding algorithm above is the same $\rho(z)$ but with coefficients in reverse order, because
the roots of $\sigma$ are the inverses of the roots of $\rho$.) Then $\rho(z) \cdot P(z) = \rho(z) \cdot c(z)$ for $z = x_1, x_2, \ldots, x_n$.
Since $x_1, \ldots, x_n$ are all the nonzero field elements, $\prod_{i=1}^{n} (z - x_i) = z^n - 1$. Thus,

$$\rho(z) \cdot c(z) = \rho(z) \cdot P(z) \bmod \prod_{i=1}^{n} (z - x_i) = \rho(z) \cdot P(z) \bmod (z^n - 1).$$

If we write the left-hand side as $a_{n-1}z^{n-1} + a_{n-2}z^{n-2} + \cdots + a_0$, then the above equation implies
that $a_{n-1} = \cdots = a_{n-(\delta-1)/2} = 0$ (because the degree of $\rho(z) \cdot c(z)$ is at most $n - (\delta + 1)/2$). Because
$a_{n-1}, \ldots, a_{n-(\delta-1)/2}$ depend on the coefficients of $\rho$ as well as on $p_{n-1}, \ldots, p_{n-(\delta-1)}$, but not on lower
coefficients of $P$, we obtain a system of $(\delta - 1)/2$ equations for $(\delta - 1)/2$ unknown coefficients of $\rho$. A
careful examination shows that it is essentially the same system as we had for $\sigma(z)$ in the algorithm above.
The lowest-degree solution to this system is indeed the correct $\rho$, by the same argument which was used
to prove the correctness of $\sigma$ in Lemma E.1. The roots of $\rho$ are the error locations. For $q > 2$, the actual
corrections that are needed at the error locations (in other words, the light vector corresponding to the given
syndrome) can then be recovered by solving the linear system of equations implied by the value of the
syndrome.